Forcing BERT hidden dimension size

Hi. I am doing a parameter study to investigate the effect of different hidden dimension sizes in pretrained models. I've successfully used the code below for RoBERTa and MentalBERT, but I can't seem to get ignore_mismatched_sizes to work for bert-base-uncased. Even though it is already set in the code, the RuntimeError I receive still says

You may consider adding ignore_mismatched_sizes=True in the model from_pretrained method.

I have replicated the issue using Colab with this code:

from transformers import AutoConfig, AutoModelForSequenceClassification, AutoModel

# checkpoint = "bert-base-uncased"   #Throws RuntimeError
checkpoint = "roberta-base"

num_class = 3
args = {"hidden_size": 48}

config = AutoConfig.from_pretrained(checkpoint, num_labels=num_class, **args)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, config=config, ignore_mismatched_sizes=True)

To add, using it for RoBERTa just throws a lot of warnings but nonetheless returns the model with the forced hidden dimension. BERT, on the other hand, throws the error above.

I'm quite confused as to why it works for the other models but not for BERT. Any insight on how to force BERT's hidden dimension size is greatly appreciated.

I have never heard of this kind of dimension reduction at the hidden layer.
BERT outputs 768-dimensional hidden states and that's all you get… no more and no less.
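
For reference, here is a minimal sketch (assuming bert-base-uncased and the standard AutoModel/AutoTokenizer API) that shows the encoder output size is fixed at 768:

from transformers import AutoModel, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("hello world", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Shape is (batch, seq_len, 768) for bert-base-uncased
print(outputs.last_hidden_state.shape)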

I know of 3 possible ways around this though:

  1. Add an additional layer that takes the 768-dimensional output and projects it to the size you want, say [32, 64, 128, 256, 512], and fine-tune all versions of the model on the same dataset. Do it multiple times and average the results (see the sketch after this list).
  2. Randomly pick N dimensions to keep. Reshuffle this selection for every model size, multiple times.
  3. Use a dimensionality reduction algorithm such as PCA as the last step in a pipeline. Again, do it multiple times.
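
A minimal sketch of option 1, assuming a sequence classification setup; the projection size of 128, the label count of 3, and the module names are placeholders rather than anything from the original post:

import torch
import torch.nn as nn
from transformers import AutoModel

class BertWithProjection(nn.Module):
    # BERT encoder followed by a linear projection down to a smaller "hidden" size,
    # with a classification head on top of the projected [CLS] representation.
    def __init__(self, checkpoint="bert-base-uncased", proj_dim=128, num_labels=3):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(checkpoint)
        hidden = self.encoder.config.hidden_size          # 768 for bert-base-uncased
        self.proj = nn.Linear(hidden, proj_dim)           # 768 -> proj_dim
        self.classifier = nn.Linear(proj_dim, num_labels)

    def forward(self, input_ids, attention_mask=None):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]                 # [CLS] token vector
        return self.classifier(torch.relu(self.proj(cls)))

You would then repeat the fine-tuning for each proj_dim in [32, 64, 128, 256, 512] and average the scores, as described in option 1.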