LLaVA multi-image input support for inference

alzaia · February 6, 2024, 1:16am

Thanks! This is what I was expecting. I saw the same kind of answers from the authors on their GitHub as well. I guess we will need to wait for LLaVA 2.0 for this (LLaVA 1.6 just came out, but I do not think it was trained on multi-image inputs).
