Please help me fine-tune models for image captioning. I want to fine-tune CLIP, ViT, and BLIP. If there are other models suited to this task, please point me to them.