Hello,
I am trying to understand the Transformer architecture better, and in particular how to extract the contextual embeddings for a given sentence.
I know I can use the feature-extraction pipeline, but I would like to extract the embeddings manually. Consider the small example below.
Unfortunately, the last hidden states cannot be the contextual embeddings: I get a vector with only 2 dimensions, whereas the embeddings should have hundreds of dimensions.
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained('distilbert-base-uncased-finetuned-sst-2-english')
model = TFAutoModelForSequenceClassification.from_pretrained('distilbert-base-uncased-finetuned-sst-2-english')

# Encode the sentence and add a batch dimension
input_ids = tf.constant(tokenizer.encode("Hello I am a dog."))[None, :]
outputs = model(input_ids)
last_hidden_states = outputs[0]
last_hidden_states.numpy()
# Out[22]: array([[-1.651872 ,  1.6822953]], dtype=float32)
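For reference, the feature-extraction pipeline I mentioned does give me per-token vectors; a minimal sketch of what I am trying to replicate manually (same checkpoint; I assume the hidden size is 768, DistilBERT's default):

```python
from transformers import pipeline

# High-level route that does return contextual embeddings,
# one vector per token of the input sentence.
extractor = pipeline("feature-extraction",
                     model="distilbert-base-uncased-finetuned-sst-2-english")

features = extractor("Hello I am a dog.")
# features is a nested list: [batch][token][hidden_dim]
print(len(features[0]), len(features[0][0]))  # n_tokens, hidden size (768 for DistilBERT)
```

So per token I get a 768-dimensional vector here, not a length-2 vector as in my manual attempt above.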
What is the issue here? Thanks!