How to pass CLIP image embeddings to BLIP-2 for captioning?

Hi, I want to pass precomputed CLIP image embeddings (1x768 or 257x768) to BLIP-2 to generate captions, and I’m wondering whether this can be done through diffusers or by other means.

Any help would be greatly appreciated.

I found this related thread on Reddit:
https://www.reddit.com/r/comfyui/comments/15vsj72/is_there_a_way_to_do_clip_interrogation_on_an/