I want to get sentence embedding vectors from BERT to use in other classification tasks:
```python
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
inputs = tokenizer("this is a test.", return_tensors="pt")
outputs = model(**inputs)
```
If I do it this way:
```python
embedding_of_sentence = outputs[1]  # the pooler_output
```
then, according to the documentation, this output is the:
> * **pooler_output** (`torch.FloatTensor` of shape `(batch_size, hidden_size)`) – Last layer hidden-state of the first token of the sequence (classification token) further processed by a Linear layer and a Tanh activation function. The Linear layer weights are trained from the next sentence prediction (classification) objective during pretraining.
So this is the last-layer hidden state of the first token ([CLS]) of the sentence, further processed for classification, which seems right for a sentence embedding.
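To see what the documented pooler actually does, here is a minimal sketch with toy tensors: a randomly initialized `torch.nn.Linear` stands in for BERT's pretrained pooler weights, and the shapes (hidden size 768, as in bert-base-uncased) are assumptions for illustration only.

```python
import torch

torch.manual_seed(0)
batch, seq_len, hidden = 1, 6, 768           # hidden=768 as in bert-base-uncased
last_hidden_state = torch.randn(batch, seq_len, hidden)  # stand-in for the model's last layer

cls_hidden = last_hidden_state[:, 0, :]      # hidden state of the first token ([CLS])
dense = torch.nn.Linear(hidden, hidden)      # random weights here; pretrained via NSP in real BERT
pooler_output = torch.tanh(dense(cls_hidden))  # Linear + Tanh, as the docs describe

print(cls_hidden.shape)      # torch.Size([1, 768])
print(pooler_output.shape)   # torch.Size([1, 768])
```

Both tensors have the same shape, but `pooler_output` is a transformed version of the raw [CLS] hidden state, not the same vector.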
However, another post suggests that people “usually only take the hidden states of the [CLS] token of the last layer”,
and the code is:
```python
embedding_of_last_layer = outputs[0]  # last_hidden_state, shape (batch_size, seq_len, hidden_size)
embedding_of_sentence = embedding_of_last_layer[:, 0, :]  # hidden state of the [CLS] token
```
The two methods produce different sentence embeddings. Which one is right, or better?