Blip2 with a new LLM

inkasaras · August 15, 2023, 6:21pm

Hey,

I would like to add a new LLM to a Blip2 model, one different from the LLMs in the already pre-trained versions, like Vicuna, OPT, or FlanT5.

For that, I’m loading the Blip2 model one piece at a time: the vision model first, then the Q-Former, and finally the LLM. I followed the instructions here, but now I’m at a loss: I don’t know how to connect these three parts, or which outputs feed into which inputs. If anyone has any ideas, it would be much appreciated.

Code to load the vision model from Hugging Face

from transformers import Blip2VisionConfig, Blip2VisionModel

# Initializing a Blip2VisionConfig with Salesforce/blip2-opt-2.7b style configuration
vit_configuration = Blip2VisionConfig()

# Initializing a Blip2VisionModel (with random weights) 
vit_model = Blip2VisionModel(vit_configuration)

# Accessing the model configuration
vit_configuration_f = vit_model.config

Code for loading the Q-Former

from transformers import Blip2QFormerConfig, Blip2QFormerModel

# Initializing a BLIP-2 Salesforce/blip2-opt-2.7b style configuration
q_configuration = Blip2QFormerConfig()

# Initializing a model (with random weights) from the Salesforce/blip2-opt-2.7b style configuration
q_model = Blip2QFormerModel(q_configuration)
# Accessing the model configuration
q_configuration_f = q_model.config
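
For anyone landing here with the same question, below is a minimal sketch of one way the three parts could be wired together. It mirrors the general BLIP-2 layout (vision encoder, then learned query tokens passed through the Q-Former with cross-attention over the image features, then a linear projection into the LLM's embedding space), but it is not a definitive implementation. Assumptions: the config sizes are tiny, made-up values so the code runs quickly with random weights; `num_query_tokens`, `llm_hidden_size`, and `llm_vocab_size` are hypothetical; and `llm_embed_tokens` is only a stand-in for the input embedding layer of whichever LLM you plug in.

```python
import torch
from torch import nn
from transformers import (
    Blip2QFormerConfig,
    Blip2QFormerModel,
    Blip2VisionConfig,
    Blip2VisionModel,
)

# Tiny, hypothetical sizes so the sketch runs quickly with random weights.
vit_config = Blip2VisionConfig(
    hidden_size=32, intermediate_size=64, num_hidden_layers=2,
    num_attention_heads=4, image_size=28, patch_size=14,
)
q_config = Blip2QFormerConfig(
    hidden_size=48, intermediate_size=96, num_hidden_layers=2,
    num_attention_heads=4,
    encoder_hidden_size=vit_config.hidden_size,  # must match the ViT width
)
num_query_tokens = 8   # assumption; pick what your setup needs
llm_hidden_size = 64   # assumption; use your LLM's hidden size
llm_vocab_size = 100   # assumption; use your LLM's vocabulary size

vit_model = Blip2VisionModel(vit_config).eval()
q_model = Blip2QFormerModel(q_config).eval()

# Glue pieces that live outside the two sub-models.
query_tokens = nn.Parameter(torch.zeros(1, num_query_tokens, q_config.hidden_size))
language_projection = nn.Linear(q_config.hidden_size, llm_hidden_size)
# Stand-in for the new LLM's own input embedding layer.
llm_embed_tokens = nn.Embedding(llm_vocab_size, llm_hidden_size)

pixel_values = torch.randn(2, 3, 28, 28)               # dummy image batch
input_ids = torch.randint(0, llm_vocab_size, (2, 5))   # dummy prompt tokens

with torch.no_grad():
    # 1) Vision model: pixels -> patch embeddings (batch, patches + 1, vit width)
    image_embeds = vit_model(pixel_values=pixel_values).last_hidden_state
    image_mask = torch.ones(image_embeds.shape[:-1], dtype=torch.long)

    # 2) Q-Former: query tokens cross-attend to the image embeddings
    query_output = q_model(
        query_embeds=query_tokens.expand(image_embeds.shape[0], -1, -1),
        encoder_hidden_states=image_embeds,
        encoder_attention_mask=image_mask,
    ).last_hidden_state

    # 3) Project the query outputs into the LLM embedding space and
    #    prepend them to the text embeddings.
    visual_prefix = language_projection(query_output)
    text_embeds = llm_embed_tokens(input_ids)
    inputs_embeds = torch.cat([visual_prefix, text_embeds], dim=1)
    attention_mask = torch.ones(inputs_embeds.shape[:-1], dtype=torch.long)

print(image_embeds.shape, query_output.shape, inputs_embeds.shape)
```

With a real LLM, `inputs_embeds` and `attention_mask` would then be passed to the language model's forward call, assuming it accepts an `inputs_embeds` argument. The query tokens and the projection layer start untrained here, so they would need training before the LLM produces anything meaningful from the image.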
