Hello,
I’m trying to extract embeddings for several sentences from Llama-2-7b, and I’m getting the same embedding vector for the class token from the last hidden layer no matter what the input is.
```python
from transformers import LlamaTokenizer, LlamaModel

llama2_tokenizer = LlamaTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
llama2_model = LlamaModel.from_pretrained("meta-llama/Llama-2-7b-hf")

inputs = llama2_tokenizer("put any sentence here", return_tensors="pt")
outputs = llama2_model(**inputs, return_dict=True)

# Hidden state of the first token, taken from the last hidden layer
print(outputs.last_hidden_state[0, 0])
```
Since the class token is the first token, I index `[0, 0]` (batch item 0, token position 0) for a batch of size 1.
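For reference, here is a rough sketch of how I'm comparing two different inputs (it reuses the tokenizer and model loaded above; the two sentences are just arbitrary examples):

```python
import torch

sentences = ["The weather is nice today.", "Quantum computing uses qubits."]
first_token_vectors = []
for text in sentences:
    inputs = llama2_tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = llama2_model(**inputs, return_dict=True)
    # Hidden state of the first token for batch item 0
    first_token_vectors.append(outputs.last_hidden_state[0, 0])

# In my runs this prints True, i.e. the first-token vectors are identical
print(torch.allclose(first_token_vectors[0], first_token_vectors[1]))
```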
Would appreciate your thoughts on this. Thanks!