Hello,
Is there anyone who can help me understand and implement
microsoft/Phi-3-vision-128k-instruct
into LangChain as an agent? I cannot figure out how to initialize the agent and pass an image together with the prompt.
LangChain's HuggingFacePipeline does not seem to implement such a feature yet.
Same with Ollama, and I don't want to use Azure.
Maybe there is some other option: inherit from a LangChain base class and create a custom class whose invoke accepts a prompt together with an image, and preprocesses them the way the authors of microsoft/Phi-3-vision-128k-instruct show in their example? Something like the sketch below?
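For reference, here is roughly what I have in mind: a minimal, untested sketch that subclasses `BaseChatModel` from langchain-core and reuses the generation code from the Phi-3-vision model card. The class name `PhiVisionChatModel`, the field names, and the base64 data-URL handling are my own assumptions, not anything official from LangChain or Microsoft, and it assumes a CUDA GPU is available.

```python
import base64
import io
from typing import Any, List, Optional

from langchain_core.callbacks import CallbackManagerForLLMRun
from langchain_core.language_models.chat_models import BaseChatModel
from langchain_core.messages import AIMessage, BaseMessage
from langchain_core.outputs import ChatGeneration, ChatResult
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

# LangChain message types -> roles expected by the Phi-3 chat template.
ROLE_MAP = {"human": "user", "ai": "assistant", "system": "system"}


class PhiVisionChatModel(BaseChatModel):
    """Unofficial sketch: LangChain chat-model wrapper around Phi-3-vision."""

    hf_model_id: str = "microsoft/Phi-3-vision-128k-instruct"
    max_new_tokens: int = 500
    hf_model: Any = None
    hf_processor: Any = None

    def __init__(self, **kwargs: Any) -> None:
        super().__init__(**kwargs)
        # Loading code taken from the Hugging Face model card (requires a GPU).
        self.hf_model = AutoModelForCausalLM.from_pretrained(
            self.hf_model_id, device_map="cuda", trust_remote_code=True, torch_dtype="auto"
        )
        self.hf_processor = AutoProcessor.from_pretrained(self.hf_model_id, trust_remote_code=True)

    @property
    def _llm_type(self) -> str:
        return "phi-3-vision"

    def _generate(
        self,
        messages: List[BaseMessage],
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> ChatResult:
        # Convert LangChain messages (possibly with multimodal content blocks)
        # into the <|image_N|> chat format shown on the model card.
        images: List[Image.Image] = []
        chat: List[dict] = []
        for msg in messages:
            role = ROLE_MAP.get(msg.type, "user")
            if isinstance(msg.content, str):
                chat.append({"role": role, "content": msg.content})
                continue
            parts: List[str] = []
            for block in msg.content:
                if block.get("type") == "text":
                    parts.append(block["text"])
                elif block.get("type") == "image_url":
                    # Assumes a base64 data URL: "data:image/...;base64,<bytes>".
                    b64 = block["image_url"]["url"].split(",", 1)[1]
                    images.append(Image.open(io.BytesIO(base64.b64decode(b64))))
                    parts.insert(0, f"<|image_{len(images)}|>")
            chat.append({"role": role, "content": "\n".join(parts)})

        prompt = self.hf_processor.tokenizer.apply_chat_template(
            chat, tokenize=False, add_generation_prompt=True
        )
        inputs = self.hf_processor(prompt, images or None, return_tensors="pt").to("cuda")
        out = self.hf_model.generate(
            **inputs,
            max_new_tokens=self.max_new_tokens,
            eos_token_id=self.hf_processor.tokenizer.eos_token_id,
        )
        out = out[:, inputs["input_ids"].shape[1]:]  # strip the echoed prompt tokens
        text = self.hf_processor.batch_decode(out, skip_special_tokens=True)[0]
        return ChatResult(generations=[ChatGeneration(message=AIMessage(content=text))])
```

If something like this is viable, I would then invoke it with LangChain's standard multimodal message format, e.g. (`b64_image` is a placeholder for base64-encoded image bytes):

```python
from langchain_core.messages import HumanMessage

llm = PhiVisionChatModel()
msg = HumanMessage(content=[
    {"type": "text", "text": "What is shown in this image?"},
    {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{b64_image}"}},
])
print(llm.invoke([msg]).content)
```

Is this the right direction, or is there a better way to plug it into an agent?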
TY