I am trying to extract an embedding from a decoder-only LLM. I tried using hidden states: I append an EOS token to the input and pass it through the model. But embeddings taken from the EOS token's hidden state in the last hidden layer, or from the concatenation of all tokens' hidden states in the last hidden layer, don't perform well when comparing different prompts with cosine similarity.
Is there a way to extract an embedding from a decoder-only model so that different prompts can be compared?
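For reference, this is roughly what I'm doing (a minimal sketch; gpt2 is just a stand-in model, and the EOS-state pooling is exactly the part I'm unsure about):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2").eval()

def embed(prompt: str) -> torch.Tensor:
    # Append the EOS token so its hidden state can act as a summary vector.
    inputs = tokenizer(prompt + tokenizer.eos_token, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs, output_hidden_states=True)
    last_hidden = outputs.hidden_states[-1]  # (1, seq_len, hidden_size)
    return last_hidden[0, -1]                # hidden state at the EOS position

a = embed("How do I sort a list in Python?")
b = embed("What is the capital of France?")
print(torch.nn.functional.cosine_similarity(a, b, dim=0).item())
```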
Can you please give a specific example: what is the output and what do you expect?
Hi @mahmutc,
Thanks for your interest!
I am trying to generate a vector representation of a prompt using a decoder-only model, so the input would be the prompt/sentence and the output would be a vector representing it. These vectors could then be used to compare one prompt/sentence with another; see the sketch below. This would be valuable work, since decoder-only models are evolving rapidly.
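For instance, the comparison could look like this (a hedged sketch assuming mask-aware mean pooling over the last hidden layer; the model name and pooling choice are illustrative, not a settled method):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default
model = AutoModel.from_pretrained("gpt2").eval()

def embed(prompts):
    inputs = tokenizer(prompts, return_tensors="pt", padding=True)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state   # (batch, seq, hidden)
    mask = inputs["attention_mask"].unsqueeze(-1)    # zero out padding positions
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)  # mask-aware mean pooling

vecs = embed(["first prompt", "a second, different prompt"])
print(torch.nn.functional.cosine_similarity(vecs[0], vecs[1], dim=0).item())
```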
Hi @MattiLinnanvuori,
Thanks for the results. I tried both of them: the first method (weighted average pooling) has a length issue, since the resulting vector depends on the length of the prompt, and the second method also didn't perform well. I think the paper has a proper explanation; I'm working through it.
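For anyone following along, the weighted average pooling I tried looks roughly like this (my reading of the method, with linearly increasing position weights; it may differ from the paper's exact formulation):

```python
import torch

def weighted_mean_pool(hidden: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # hidden: (batch, seq, dim); attention_mask: (batch, seq)
    weights = torch.arange(1, hidden.size(1) + 1, dtype=hidden.dtype)  # 1..seq_len
    weights = weights.unsqueeze(0) * attention_mask                    # zero out padding
    weights = weights / weights.sum(dim=1, keepdim=True)               # normalize per row
    return (hidden * weights.unsqueeze(-1)).sum(dim=1)                 # (batch, dim)
```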
Thanks!
Great help.