Hi, I want to pass CLIP image embeddings (1x768 or 257x768) to BLIP-2 to generate captions, and I'm wondering whether this can be done through diffusers or by other means.
Any help would be greatly appreciated.
I found this related thread on Reddit:
https://www.reddit.com/r/comfyui/comments/15vsj72/is_there_a_way_to_do_clip_interrogation_on_an/