Feature extraction output

I'm feeding a model (AutoModel.from_pretrained("distilbert-base-uncased")) with a batch of 64 samples, and trainer.predict returns a transformers.trainer_utils.PredictionOutput object.

How could I extract the embeddings for these 64 rows?

I’m using:

import numpy as np
import torch
from tqdm import tqdm
from datasets import Dataset

with torch.no_grad():
    # np.array_split(test_df, n) yields n roughly equal chunks
    for chunk in tqdm(np.array_split(test_df, INFERENCE_BATCH_SIZE)):
        test_dataset = Dataset.from_pandas(chunk.loc[:, [COL_TEXT]])
        tokenized_test_dataset = test_dataset.map(tokenize_function, batched=True)
        # keep only the columns the model expects
        tokenized_test_dataset = tokenized_test_dataset.remove_columns([x for x in tokenized_test_dataset.features.keys() if x not in ['input_ids', 'attention_mask']])

        test_result = trainer.predict(test_dataset=tokenized_test_dataset)
        ...
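
For reference, tokenize_function is a plain tokenizer wrapper, roughly along these lines (the exact padding/truncation arguments shown are illustrative, not necessarily what my script uses):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize_function(examples):
    # pad to the longest sequence in the batch and truncate to the model maximum
    return tokenizer(examples[COL_TEXT], padding=True, truncation=True)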

test_result.__class__

transformers.trainer_utils.PredictionOutput

test_result.predictions.shape

(64, 133, 768)

I expected an output of shape (64, 768), with one context vector for each input row.

What is this second dimension with a length of 133?

Are these the inputs multiplied by the attention layers?

Should I simply average them?
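
If averaging is the right approach, I assume it would look something like the sketch below: a masked mean over the token dimension, using the attention_mask so padded positions don't dilute the average (variable names are just for illustration, and it assumes every row was padded to the same length of 133):

import numpy as np

hidden = test_result.predictions                                       # (64, 133, 768) token-level states
mask = np.array(tokenized_test_dataset['attention_mask'])[..., None]   # (64, 133, 1), 1 = real token, 0 = padding

# masked mean over the token dimension -> (64, 768), one vector per row
embeddings = (hidden * mask).sum(axis=1) / mask.sum(axis=1)

That would at least give me the (64, 768) shape I expected.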

Thanks in advance.

Cheers,