How to see BERT, BART... output dimensions?

How can I see the output dimensions for BERT Large, BART Large, RoBERTa Large, BART Large CNN, and XLM-RoBERTa Large?

Really, no one knows?!

The output dimensions can be derived from the documentation of the respective models.

For example, BERT-large outputs hidden_states of shape (batch_size, sequence_len, hidden_size), as can be seen in the documentation of BertModel (see last_hidden_state under “Returns” of the forward method). You can check the hidden_size of BERT-large by inspecting its configuration, like so:

from transformers import BertConfig

# Load the configuration of the bert-large-uncased checkpoint from the Hub
config = BertConfig.from_pretrained("bert-large-uncased")
print(config.hidden_size)

This prints 1024. Note that you can also read the various config attributes (including hidden_size) directly on the Hub.

So if you send a batch of sentences through BERT-large (say, a batch size of 4), with the sentences padded up to a sequence length of 512 tokens, the output will be of shape (4, 512, 1024).
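You can verify this yourself with a minimal sketch (assuming the bert-large-uncased checkpoint and an arbitrary batch of 4 example sentences):

import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-large-uncased")
model = BertModel.from_pretrained("bert-large-uncased")

# A batch of 4 example sentences, padded to a sequence length of 512 tokens
sentences = ["The cat sat on the mat."] * 4
inputs = tokenizer(sentences, padding="max_length", max_length=512, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

print(outputs.last_hidden_state.shape)  # torch.Size([4, 512, 1024])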

RoBERTa (and XLM-RoBERTa), which are based on BERT, use the same dimensions, so their hidden_size is also 1024.
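You can confirm this with the same config trick (a quick check, assuming the roberta-large and xlm-roberta-large checkpoints):

from transformers import AutoConfig

# Both configs expose hidden_size, just like BERT's
for checkpoint in ["roberta-large", "xlm-roberta-large"]:
    config = AutoConfig.from_pretrained(checkpoint)
    print(checkpoint, config.hidden_size)  # prints 1024 for both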

BART, on the other hand, is an encoder-decoder model (whereas BERT, RoBERTa and XLM-RoBERTa are encoder-only models). It also returns hidden_states of shape (batch_size, sequence_len, hidden_size). Looking at the config of BART-large on the Hub, the hidden_size is also 1024 (BART's config calls it d_model rather than hidden_size, but they mean the same thing).
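The same approach works here, reading d_model instead of hidden_size (a sketch, assuming the facebook/bart-large and facebook/bart-large-cnn checkpoints; the latter covers the BART Large CNN model from your question):

from transformers import BartConfig

# BART's config names the hidden size d_model
for checkpoint in ["facebook/bart-large", "facebook/bart-large-cnn"]:
    config = BartConfig.from_pretrained(checkpoint)
    print(checkpoint, config.d_model)  # prints 1024 for both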