Example for Fine-Tuning CLIP or BLIP2 for VQA

Hi,
I want to fine-tune CLIP and BLIP2 for a VQA task on a custom dataset, but I am unsure how to do it. Are there any examples of fine-tuning CLIP and BLIP2 for VQA?

Thank you