Why does ignore_mismatched_sizes increase the number of TFAlbertMainLayer parameters?

If I load the model from pretrained without much in the way of configs, I get about 11 million parameters in the Albert main layer. If I load it with the problem type changed and ignore_mismatched_sizes set, the main layer has 222 million parameters. This seems strange to me; I thought changing the problem type would only affect the classifier?

from transformers import AlbertConfig, TFAlbertForSequenceClassification

model = TFAlbertForSequenceClassification.from_pretrained('albert-base-v2', config=AlbertConfig(problem_type="single_label_classification"), ignore_mismatched_sizes=True)

Hi! The problem here is that a freshly constructed AlbertConfig() has default sizes that are totally different from the ones in albert-base-v2 (the defaults correspond to the much larger xxlarge architecture, which is where your ~222 million parameters come from). With ignore_mismatched_sizes=True, every mismatched pretrained weight is discarded and the bigger main layer is randomly initialized instead. If you’d like to train a sequence classification model on top of Albert, you can just do:
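You can see the mismatch directly by comparing the default config against the checkpoint’s config. A quick sketch (the exact attribute values depend on your transformers version):

from transformers import AlbertConfig

default_cfg = AlbertConfig()  # library defaults, much larger than albert-base-v2
base_cfg = AlbertConfig.from_pretrained('albert-base-v2')  # the checkpoint's real sizes

# Compare the dimensions that drive the parameter count
for name in ("hidden_size", "intermediate_size", "num_attention_heads"):
    print(name, getattr(default_cfg, name), "vs", getattr(base_cfg, name))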

model = TFAlbertForSequenceClassification.from_pretrained('albert-base-v2', num_labels=2)
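As a sanity check, you can inspect the per-layer parameter counts after loading. A sketch, assuming the main layer is registered under the name "albert" (which is how current transformers versions name it):

# The backbone should be base-sized (~11M params), not ~222M
model.summary()
print(model.get_layer("albert").count_params())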

Alternatively, if you want to use a config object, you should initialize it from albert-base-v2 like this:

model = TFAlbertForSequenceClassification.from_pretrained('albert-base-v2', config=AlbertConfig.from_pretrained('albert-base-v2', problem_type="single_label_classification"))
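If you also need to change the number of labels, both overrides can go through the same from_pretrained call; since the config carries the checkpoint’s dimensions, only the classifier head is freshly initialized (num_labels=2 below is a placeholder for your actual label count):

from transformers import AlbertConfig, TFAlbertForSequenceClassification

config = AlbertConfig.from_pretrained(
    'albert-base-v2',
    problem_type="single_label_classification",
    num_labels=2,  # placeholder; set this to your task's label count
)
model = TFAlbertForSequenceClassification.from_pretrained('albert-base-v2', config=config)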