How to see BERT, BART... output dimensions?

How can I see the output dimensions for BERT Large, BART Large, RoBERTa Large, BART Large CNN, and XLM-RoBERTa Large?

Really, no one knows?!

The output dimensions can be derived from the documentation of the respective models.

For example, BERT-large outputs hidden_states of shape (batch_size, sequence_len, hidden_size), as can be seen in the documentation of BertModel (see last_hidden_state under “Returns” of the forward method). You can check the hidden_size of BERT-large by inspecting its configuration, like so:

from transformers import BertConfig

# Load the configuration of the bert-large-uncased checkpoint from the Hub
config = BertConfig.from_pretrained("bert-large-uncased")
print(config.hidden_size)

This prints 1024. Note that you can also read the various config attributes (including hidden_size) directly on the Hub.

So if you send a batch of sentences through BERT-large (say, a batch size of 4), with the sentences padded up to a sequence length of 512 tokens, the output will be of shape (4, 512, 1024).
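You can verify this yourself with a minimal sketch (assuming the bert-large-uncased checkpoint and an arbitrary batch of 4 example sentences):

import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-large-uncased")
model = BertModel.from_pretrained("bert-large-uncased")

# A batch of 4 example sentences, padded to a sequence length of 512 tokens
sentences = ["The cat sat on the mat."] * 4
inputs = tokenizer(sentences, padding="max_length", max_length=512, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

print(outputs.last_hidden_state.shape)  # torch.Size([4, 512, 1024])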

RoBERTa (and XLM-RoBERTa), which are based on BERT, use the same dimensions, so their hidden_size is also 1024.
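You can confirm this with the same config trick (a quick check, assuming the roberta-large and xlm-roberta-large checkpoints):

from transformers import AutoConfig

# Both configs expose hidden_size, just like BERT's
for checkpoint in ["roberta-large", "xlm-roberta-large"]:
    config = AutoConfig.from_pretrained(checkpoint)
    print(checkpoint, config.hidden_size)  # prints 1024 for both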

BART, on the other hand, is an encoder-decoder model (whereas BERT, RoBERTa and XLM-RoBERTa are encoder-only models). It also returns hidden_states of shape (batch_size, sequence_len, hidden_size). Looking at the config of BART-large on the Hub, the hidden_size is also 1024 (BART's config calls it d_model rather than hidden_size, but they mean the same thing).
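The same approach works here, reading d_model instead of hidden_size (a sketch, assuming the facebook/bart-large and facebook/bart-large-cnn checkpoints; the latter covers the BART Large CNN model from your question):

from transformers import BartConfig

# BART's config names the hidden size d_model
for checkpoint in ["facebook/bart-large", "facebook/bart-large-cnn"]:
    config = BartConfig.from_pretrained(checkpoint)
    print(checkpoint, config.d_model)  # prints 1024 for both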