Why does the BERT classification head matrix have this shape?

Hi everyone,
I downloaded the NLI BERT model morit/french_xlm_xnli.

Looking into the classification head, here is its string representation:

XLMRobertaClassificationHead(
  (dense): Linear(in_features=768, out_features=768, bias=True)
  (dropout): Dropout(p=0.1, inplace=False)
  (out_proj): Linear(in_features=768, out_features=3, bias=True)
)

OK, this looks right to me: BERT's internal dimension is 768, and I expect 3 output features because of NLI (-1, 0, 1, basically).
Exporting the out_proj matrix, I expected its shape to be (768, 3), but I got the transposed shape (3, 768), and that is what I really don't understand.
Can someone enlighten me?
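For reference, here is a minimal sketch of the shape check I ran (assuming the checkpoint loads as a standard transformers sequence-classification model). PyTorch's nn.Linear stores its weight as (out_features, in_features), since forward() computes y = x @ W.T + b, which would explain the (3, 768) shape:

# Minimal shape check (sketch; assumes the checkpoint loads as a
# standard transformers sequence-classification model).
import torch
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("morit/french_xlm_xnli")
out_proj = model.classifier.out_proj

# nn.Linear stores weight as (out_features, in_features),
# because forward() computes y = x @ W.T + b.
print(out_proj.weight.shape)  # torch.Size([3, 768])
print(out_proj.bias.shape)    # torch.Size([3])

# Applying it by hand: a (batch, 768) input times W.T gives (batch, 3).
x = torch.randn(2, 768)
y = x @ out_proj.weight.T + out_proj.bias
print(y.shape)                # torch.Size([2, 3])

If that convention is the whole story, the (3, 768) shape is just how the weight is stored, not a different mapping.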


I realized that I didn’t understand.