Image to text model that can take an additional text input

nadnadoni1234 September 7, 2023, 9:05am

Hi, can anyone recommend an image-to-text model that can take an additional text input to provide prior context for generating the caption?
