I’m currently trying to see whether BERT’s performance on a binary classification task improves after first fine-tuning it on another task (regression).
My pipeline is:
- bert base → fine-tune on regression → fine-tune on classification → test
- bert base → fine-tune on classification → test
For all the steps I’m using run_glue_no_trainer.py (with small modifications) from transformers/examples/pytorch/text-classification.
I have a problem loading the regression-fine-tuned BERT weights into AutoModelForSequenceClassification. Looking through GitHub issues I found the parameter “ignore_mismatched_sizes=True”, which lets me load the weights without an error, but then I get this runtime error during training:
huggingface RuntimeError: The size of tensor a (2) must match the size of tensor b (8) at non-singleton dimension 1
I assume the problem is that ignore_mismatched_sizes is meant for going from one classification head to another classification head with a different number of classes, so it loads the weights anyway (but the regression output for a sample has size 1, while for binary classification it has size 2).
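To make the size mismatch concrete, here is a toy comparison of the two heads (the hidden size here is made up for the sketch; real BERT base uses 768):

```python
import torch

hidden = 16  # toy hidden size; BERT base would be 768

# A regression head maps the pooled hidden state to 1 output,
# a binary classification head maps it to 2 logits.
regression_head = torch.nn.Linear(hidden, 1)
classification_head = torch.nn.Linear(hidden, 2)

# The saved regression weights have shape (1, hidden) and cannot be
# copied into a (2, hidden) classifier, so the loader either errors
# or (with ignore_mismatched_sizes=True) re-initializes that layer.
print(regression_head.weight.shape)      # torch.Size([1, 16])
print(classification_head.weight.shape)  # torch.Size([2, 16])
```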
Given that I’m not interested in keeping my regression head, how can I take only the BERT encoder weights from my regression-fine-tuned model? This is my attempt:
config = BertConfig.from_pretrained(BERT_BASE)
model = BertModel.from_pretrained(REGRESSION_FINE_TUNED_BERT, config=config)
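To show what I expect to happen, here is a runnable toy sketch (a tiny random config stands in for my real checkpoints, and I use an explicit state-dict copy, which I believe is equivalent to what from_pretrained does here):

```python
import torch
from transformers import BertConfig, BertModel, BertForSequenceClassification

# Tiny made-up config so the sketch runs without downloading anything;
# in practice from_pretrained(...) with the checkpoint paths replaces this.
cfg = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=2,
                 num_attention_heads=2, intermediate_size=64)

backbone = BertModel(cfg)  # stands in for the regression-fine-tuned encoder

clf_cfg = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=2,
                     num_attention_heads=2, intermediate_size=64, num_labels=2)
clf = BertForSequenceClassification(clf_cfg)

# Copy only the encoder (and pooler) weights; the classifier head keeps
# its fresh random initialization, which is what the warning reports.
clf.bert.load_state_dict(backbone.state_dict())

# The embeddings now match the backbone, while the classifier is new.
assert torch.equal(clf.bert.embeddings.word_embeddings.weight,
                   backbone.embeddings.word_embeddings.weight)
```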
Using this model to initialize BertForSequenceClassification I don’t get errors, but I get this warning, which I wasn’t expecting since I should have removed the regression/classification head:
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at /home/irene/similarity_eval/preprocessing/bert_similarity and are newly initialized: ['classifier.weight', 'classifier.bias']
Thanks in advance!!