I want to use a 7B LLaVA model with Hugging Face, but I can't really find any docs for using it. Any help would be great.
I deployed a model to SageMaker with the SageMaker deployment card HF provides. Currently this model: hxxps://huggingface.co/liuhaotian/llava-llama-2-13b-chat-lightning-preview/discussions/3
However one of my concerns is that the card states 'HF_TASK': 'text-generation'
whereas LLaVA-LLaMA is really a text-plus-image / image "question-answering" type of model.
This topic suggests the Transformers setup needs tinkering: hxxps://discuss.huggingface.co/t/can-text-to-image-models-be-deployed-to-a-sagemaker-endpoint/20120
So I still haven't got it working. Plus I didn't have enough quota on AWS to deploy it on a half-decent box with a GPU, so whether the box can carry its weight at all will be another question. I'm surprised no one has helped me so far, neither in the HF model discussions, GitHub discussions (hxxps://github.com/haotian-liu/LLaVA/discussions/454), nor other forums.
This Hugging Face discussion (hxxps://discuss.huggingface.co/t/can-text-to-image-models-be-deployed-to-a-sagemaker-endpoint/20120) says that a custom inference.py needs to be created. I don't know what the LLaVA-LLaMA repo ships with, though. I looked through the files of the model, but I don't see any relevant metadata about this.
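For what it's worth, here is a minimal sketch of what such an inference.py could look like. The handler names (model_fn / input_fn / predict_fn / output_fn) follow the SageMaker Hugging Face Inference Toolkit convention; the LLaVA loading code and the JSON payload schema (prompt plus base64-encoded image) are my assumptions, not anything the model repo actually ships:

```python
# Hypothetical inference.py for a SageMaker Hugging Face endpoint.
# Handler names follow the SageMaker HF Inference Toolkit convention;
# the model-loading code assumes a Transformers version with native
# LLaVA support and a checkpoint in that format.
import base64
import io
import json


def model_fn(model_dir):
    # Load processor + model from the unpacked model artifact.
    from transformers import AutoProcessor, LlavaForConditionalGeneration
    processor = AutoProcessor.from_pretrained(model_dir)
    model = LlavaForConditionalGeneration.from_pretrained(model_dir)
    return {"processor": processor, "model": model}


def input_fn(request_body, content_type="application/json"):
    # Expect JSON with a text prompt and a base64-encoded image.
    if content_type != "application/json":
        raise ValueError(f"Unsupported content type: {content_type}")
    payload = json.loads(request_body)
    return {
        "prompt": payload["prompt"],
        "image_bytes": base64.b64decode(payload["image"]),
    }


def predict_fn(data, artifacts):
    from PIL import Image
    image = Image.open(io.BytesIO(data["image_bytes"]))
    processor, model = artifacts["processor"], artifacts["model"]
    inputs = processor(text=data["prompt"], images=image, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=200)
    return processor.batch_decode(output_ids, skip_special_tokens=True)[0]


def output_fn(prediction, accept="application/json"):
    return json.dumps({"generated_text": prediction})
```

The file would go into a code/ directory inside the model.tar.gz so the toolkit picks it up instead of the default text-generation pipeline.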
This StackOverflow entry hxxps://stackoverflow.com/questions/76197446/how-to-do-model-inference-on-a-multimodal-model-from-hugginface-using-sagemaker is about a serverless deployment case, but it uses a custom TextImageSerializer. Should I try to use something like that?
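A custom serializer may not even be necessary: if the endpoint accepts JSON, you can build the payload yourself and call it with plain boto3. A sketch, where the endpoint name ("llava-endpoint") and the payload schema (prompt plus base64 image) are assumptions that have to match whatever the endpoint's inference code expects:

```python
# Sketch: invoking a deployed multimodal endpoint with boto3, sending the
# image as base64 inside a JSON payload instead of using a custom serializer.
# Endpoint name and payload schema are hypothetical.
import base64
import json


def build_payload(prompt, image_bytes):
    """Pack the prompt and raw image bytes into a JSON-serializable dict."""
    return {
        "prompt": prompt,
        "image": base64.b64encode(image_bytes).decode("ascii"),
    }


def invoke(endpoint_name, prompt, image_path):
    import boto3  # only needed for the actual call
    with open(image_path, "rb") as f:
        payload = build_payload(prompt, f.read())
    client = boto3.client("sagemaker-runtime")
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps(payload),
    )
    return json.loads(response["Body"].read())


if __name__ == "__main__":
    print(invoke("llava-endpoint",
                 "USER: <image>\nDescribe this image. ASSISTANT:",
                 "cat.jpg"))
```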
My Stackoverflow entry: hxxps://stackoverflow.com/questions/77193088/how-to-perform-an-inference-on-a-llava-llama-model-deployed-to-sagemake-from-hug
Reddit: hxxps://www.reddit.com/r/LocalLLaMA/comments/16pzn88/how_to_parametrize_a_llava_llama_model/
Check out the following blog post:
It uses Hugging Face Transformers LLaVA and Runhouse.
For AWS SageMaker, you can check out this one: hxxps://www.run.house/blog/quickest-aws-sagemaker-deployment
Hi,
LLaVa and BakLLaVa are now supported natively in the Transformers library.
Docs: LLaVa
Checkpoints are on the hub: llava-hf (Llava Hugging Face).
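With the native support, local inference on a 7B checkpoint can be sketched roughly like this. The model id below is one of the llava-hf hub checkpoints, and the prompt template follows the LLaVA-1.5 chat format; check the linked docs for the exact usage in your Transformers version:

```python
# Sketch of native LLaVA inference in recent Transformers versions.
# Model id and prompt template assume the llava-hf 1.5 checkpoints.
def build_prompt(question):
    """LLaVA-1.5 style chat prompt with an image placeholder."""
    return f"USER: <image>\n{question} ASSISTANT:"


def run(image_path, question, model_id="llava-hf/llava-1.5-7b-hf"):
    from PIL import Image
    from transformers import AutoProcessor, LlavaForConditionalGeneration

    processor = AutoProcessor.from_pretrained(model_id)
    model = LlavaForConditionalGeneration.from_pretrained(model_id)

    inputs = processor(
        text=build_prompt(question),
        images=Image.open(image_path),
        return_tensors="pt",
    )
    output_ids = model.generate(**inputs, max_new_tokens=200)
    return processor.batch_decode(output_ids, skip_special_tokens=True)[0]


if __name__ == "__main__":
    print(run("cat.jpg", "What is in this picture?"))
```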
This topic was automatically closed 12 hours after the last reply. New replies are no longer allowed.