Feature extraction output

I’m feeding a model (AutoModel.from_pretrained("distilbert-base-uncased")) a batch of 64 samples, and trainer.predict returns a transformers.trainer_utils.PredictionOutput object.

How could I extract the embeddings for these 64 rows?

I’m using:

with torch.no_grad():
    for chunk in tqdm(np.array_split(test_df, INFERENCE_BATCH_SIZE)):
        tokenized_test_dataset = test_dataset.map(tokenize_function, batched=True)
        tokenized_test_dataset = tokenized_test_dataset.remove_columns(
            [x for x in tokenized_test_dataset.features.keys()
             if x not in ['input_ids', 'attention_mask']]
        )
        test_result = trainer.predict(test_dataset=tokenized_test_dataset)


The shape of test_result.predictions is:

(64, 133, 768)

I expected an output of size (64, 768), i.e. one context vector per input row.

What is this second dimension of length 133?

Are these the inputs multiplied by the attention layers?

Should I simply average them?
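To show what I mean by averaging: a masked mean over the token (second) dimension, so padding positions don't dilute the result. This is just a sketch with random stand-in data of the same shapes as my output; the array names are mine, not from the transformers API.

```python
import numpy as np

# Stand-ins for trainer.predict output: per-token hidden states of shape
# (batch, seq_len, hidden) plus the matching attention mask (1 = real token).
batch, seq_len, hidden = 64, 133, 768
rng = np.random.default_rng(0)
hidden_states = rng.standard_normal((batch, seq_len, hidden)).astype(np.float32)
attention_mask = np.ones((batch, seq_len), dtype=np.float32)
attention_mask[:, 100:] = 0  # pretend the last 33 positions are padding

# Masked mean pooling: average only over real (non-padding) tokens.
mask = attention_mask[:, :, None]             # (batch, seq_len, 1)
summed = (hidden_states * mask).sum(axis=1)   # (batch, hidden)
counts = mask.sum(axis=1)                     # (batch, 1), number of real tokens
embeddings = summed / counts                  # (batch, hidden) = (64, 768)

print(embeddings.shape)
```

Is this kind of pooling the right way to get one (64, 768) matrix, or should I be taking the first ([CLS]-position) token instead?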

Thanks in advance.