Embeddings from a decoder-only model

I am trying to extract an embedding from a decoder-only LLM. I tried using the hidden states: I append the EOS token to the input and pass it through the model. However, neither the hidden state of the EOS token from the last hidden layer nor the concatenation of all tokens' hidden states from the last hidden layer performs well when I compare different prompts with cosine similarity.
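For reference, this is roughly what I am doing (a minimal sketch, assuming a Hugging Face causal model such as gpt2 loaded with AutoModel; the checkpoint name and prompts are just placeholders):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2")
model.eval()

def last_token_embedding(prompt: str) -> torch.Tensor:
    # Append the EOS token so its hidden state can act as a summary of the prompt
    text = prompt + tokenizer.eos_token
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # Hidden state of the final (EOS) position from the last layer
    return outputs.last_hidden_state[0, -1, :]

# Compare two prompts with cosine similarity
a = last_token_embedding("How do I reset my password?")
b = last_token_embedding("What are the steps to change my password?")
print(torch.nn.functional.cosine_similarity(a, b, dim=0).item())
```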

Is there any way to extract an embedding from a decoder-only model so that different prompts can be compared?


Hi @manojkumar427,

> Neither the hidden state of the EOS token from the last hidden layer nor the concatenation of all tokens' hidden states from the last hidden layer performs well when I compare different prompts with cosine similarity.

Can you please give a specific example: what output do you get, and what do you expect?

This Stack Overflow post shows several ways to do it.
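For example, one approach that typically comes up in answers like that is mean pooling of the last hidden states, masked with the attention mask (a minimal sketch, assuming a gpt2 checkpoint via transformers; substitute your own decoder-only model):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2")
model.eval()

def mean_pooled_embedding(prompt: str) -> torch.Tensor:
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    hidden = outputs.last_hidden_state               # (1, seq_len, dim)
    mask = inputs["attention_mask"].unsqueeze(-1)    # (1, seq_len, 1)
    # Average only over real tokens, ignoring any padding positions
    summed = (hidden * mask).sum(dim=1)
    counts = mask.sum(dim=1).clamp(min=1)
    return (summed / counts).squeeze(0)
```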

Hi @mahmutc,
Thanks for your interest.
I am trying to generate a vector representation of a prompt with a decoder-only model: the input is the prompt/sentence and the output is a vector representing it. These vectors can then be used to compare one prompt/sentence with another. It would be great to get this working, since decoder-only models are evolving rapidly.

Hi @MattiLinnanvuori,
Thanks for the link. I tried both of them: the first method (weighted average pooling) has a length issue, since the resulting vector depends on the length of the prompt, and the second method also didn't perform well. I think the paper has a proper explanation; I'm working through it.
Thanks, great help!