Hi all, I want to use phi-3.5-text-img-text for a multi modal task which takes the image of a page and converts the contents of the page into html tags and if there is an image found on the page, it will convert that image into text and add ‘illustration’ tag beside it. How do I use the huggingface…

How do I use Text-Image to Text models with Huggingface Inference?

John6666 October 12, 2024, 1:24pm 2

It’s in the manual, but it’s a newly implemented pipeline, so I don’t know if it really works.

Topic		Replies	Views
Text To Image Interference Providers Beginners	1	36	April 13, 2025
Inference provider for captioning (image2text model) Beginners	3	33	June 16, 2025
Text to image models not yet supported? Inference Endpoints on the Hub	0	552	November 28, 2022
Models with inference_client Beginners	1	64	January 28, 2025
How do I use text-to-image huggingface models as an API for my website? Beginners	1	6220	April 20, 2023