I can turn a text prompt into CLIP embeddings with this pipeline:
prompt -> tokenizer -> tokens -> CLIPTextModel -> embeddings
I would like to go the other way and decode an embedding back into a prompt:
embeddings -> ??? -> tokens -> tokenizer -> prompt
How do I convert CLIP embeddings back into tokens?