Why does the BERT classification head matrix have this shape?

Hi everyone,
I downloaded the NLI BERT model morit/french_xlm_xnli.

Looking into the classification head, here is its string representation:

XLMRobertaClassificationHead(
  (dense): Linear(in_features=768, out_features=768, bias=True)
  (dropout): Dropout(p=0.1, inplace=False)
  (out_proj): Linear(in_features=768, out_features=3, bias=True)
)

OK, this looks right to me: BERT's internal dimension is 768, and I expect 3 output features because of NLI (-1, 0, 1, basically).
Exporting the out_proj matrix, I expected its shape to be (768, 3), but I got the transposed shape (3, 768), and that is what I really don't understand.
Can someone enlighten me?
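For reference, here is a minimal sketch of the shape check I ran (assuming the checkpoint loads as a standard transformers sequence-classification model). PyTorch's nn.Linear stores its weight as (out_features, in_features), since forward() computes y = x @ W.T + b, which would explain the (3, 768) shape:

# Minimal shape check (sketch; assumes the checkpoint loads as a
# standard transformers sequence-classification model).
import torch
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("morit/french_xlm_xnli")
out_proj = model.classifier.out_proj

# nn.Linear stores weight as (out_features, in_features),
# because forward() computes y = x @ W.T + b.
print(out_proj.weight.shape)  # torch.Size([3, 768])
print(out_proj.bias.shape)    # torch.Size([3])

# Applying it by hand: a (batch, 768) input times W.T gives (batch, 3).
x = torch.randn(2, 768)
y = x @ out_proj.weight.T + out_proj.bias
print(y.shape)                # torch.Size([2, 3])

If that convention is the whole story, the (3, 768) shape is just how the weight is stored, not a different mapping.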


I realized that I didn’t understand.