I would like to add a new LLM to a Blip2 model, one different from the already pre-trained ones such as Vicuna, OPT or FlanT5.
To do that, I'm loading the Blip2 model one piece at a time: first the vision model, then the Q-Former, and finally the LLM. I followed the instructions here, but now I'm at a loss: I don't know how to connect these three parts or which outputs feed into which inputs. If anyone has any idea, it would be much appreciated.
Code to load the vision model from Hugging Face:
from transformers import Blip2VisionConfig, Blip2VisionModel

# Initializing a Blip2VisionConfig with Salesforce/blip2-opt-2.7b style configuration
vit_configuration = Blip2VisionConfig()

# Initializing a Blip2VisionModel (with random weights)
vit_model = Blip2VisionModel(vit_configuration)

# Accessing the model configuration
vit_configuration_f = vit_model.config
Code for loading the Q-Former:
from transformers import Blip2QFormerConfig, Blip2QFormerModel

# Initializing a BLIP-2 Salesforce/blip2-opt-2.7b style configuration
q_configuration = Blip2QFormerConfig()

# Initializing a model (with random weights) from the Salesforce/blip2-opt-2.7b style configuration
q_model = Blip2QFormerModel(q_configuration)

# Accessing the model configuration
q_configuration_f = q_model.config
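For reference, here is a rough sketch of how the three parts might be chained, based on my understanding of how the Hugging Face BLIP-2 forward pass is laid out: the vision model's patch embeddings go into the Q-Former as `encoder_hidden_states`, a small set of learned query tokens goes in as `query_embeds`, and the Q-Former output is projected by a linear layer to the LLM's hidden size and prepended to the text token embeddings. Everything here is an assumption for illustration: the tiny config sizes are arbitrary, the randomly initialised GPT-2 is only a stand-in for whatever LLM you actually want to plug in, and `query_tokens` and `language_projection` are my own variable names for parts you would have to create and train yourself.

```python
import torch
from torch import nn
from transformers import (
    Blip2QFormerConfig,
    Blip2QFormerModel,
    Blip2VisionConfig,
    Blip2VisionModel,
    GPT2Config,
    GPT2LMHeadModel,
)

# Tiny, randomly initialised configs so the sketch runs quickly (sizes are arbitrary).
vit_config = Blip2VisionConfig(
    hidden_size=64, intermediate_size=128, num_hidden_layers=2,
    num_attention_heads=4, image_size=32, patch_size=8,
)
q_config = Blip2QFormerConfig(
    vocab_size=100, hidden_size=48, num_hidden_layers=2, num_attention_heads=4,
    intermediate_size=96, cross_attention_frequency=1,
    encoder_hidden_size=vit_config.hidden_size,  # must match the vision hidden size
)
# Stand-in LLM (assumption): swap in your own causal LM that accepts inputs_embeds.
llm_config = GPT2Config(vocab_size=100, n_embd=32, n_layer=2, n_head=2, n_positions=64)

vit_model = Blip2VisionModel(vit_config).eval()
q_model = Blip2QFormerModel(q_config).eval()
llm = GPT2LMHeadModel(llm_config).eval()

# Glue pieces that are not part of the three sub-models and need training.
num_query_tokens = 4
query_tokens = nn.Parameter(torch.zeros(1, num_query_tokens, q_config.hidden_size))
language_projection = nn.Linear(q_config.hidden_size, llm_config.n_embd)

batch_size = 2
pixel_values = torch.randn(batch_size, 3, vit_config.image_size, vit_config.image_size)
input_ids = torch.randint(0, llm_config.vocab_size, (batch_size, 5))

with torch.no_grad():
    # 1) Image -> patch embeddings: (batch, num_patches + 1, vit_hidden)
    image_embeds = vit_model(pixel_values=pixel_values).last_hidden_state
    image_mask = torch.ones(image_embeds.shape[:-1], dtype=torch.long)

    # 2) Query tokens cross-attend to the image: (batch, num_query_tokens, q_hidden)
    query_output = q_model(
        query_embeds=query_tokens.expand(batch_size, -1, -1),
        encoder_hidden_states=image_embeds,
        encoder_attention_mask=image_mask,
    ).last_hidden_state

    # 3) Project to the LLM embedding size and prepend to the text embeddings.
    visual_prefix = language_projection(query_output)
    text_embeds = llm.get_input_embeddings()(input_ids)
    llm_inputs = torch.cat([visual_prefix, text_embeds], dim=1)
    attention_mask = torch.ones(llm_inputs.shape[:-1], dtype=torch.long)

    # 4) The LLM consumes embeddings directly instead of token ids.
    logits = llm(inputs_embeds=llm_inputs, attention_mask=attention_mask).logits

print(image_embeds.shape, query_output.shape, llm_inputs.shape, logits.shape)
```

With a real LLM you would presumably load it with `from_pretrained`, set the output size of `language_projection` to that model's hidden size, and train at least the projection (and likely the Q-Former and query tokens) while keeping the vision model and LLM frozen; I have not verified how well that works for an arbitrary LLM.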