Hi everyone,
I'm currently trying to see whether BERT's performance on a binary classification task improves after first fine-tuning it on another task (regression).
My pipeline is:
- bert base → fine-tune on regression → fine-tune on classification → test

vs.

- bert base → fine-tune on classification → test
For all the steps I'm using run_glue_no_trainer.py (with small modifications), found in transformers/examples/pytorch/text-classification.
I have a problem with loading the regression-fine-tuned BERT weights into AutoModelForSequenceClassification. Looking at GitHub issues, I found the parameter ignore_mismatched_sizes=True, which lets me load the weights without error, but then I get this runtime error when training:
```
RuntimeError: The size of tensor a (2) must match the size of tensor b (8) at non-singleton dimension 1
```
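For reference, the loading step ends up being roughly the following (the checkpoint path is a placeholder for my actual one):

```python
from transformers import AutoModelForSequenceClassification

# sketch of the loading call; "regression-checkpoint" stands in for my real path
model = AutoModelForSequenceClassification.from_pretrained(
    "regression-checkpoint",       # BERT previously fine-tuned on the regression task
    num_labels=2,                  # binary classification
    ignore_mismatched_sizes=True,  # without this, loading fails on the head shapes
)
```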
I assume the problem is that ignore_mismatched_sizes is meant for going from one classification head to another classification head with a different number of classes, so it loads the weights anyway (but the regression output for a sample has size 1, while for binary classification it has size 2).
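To make the size difference concrete, this is the check I would expect to hold (checkpoint names are placeholders; the 768 assumes bert-base's hidden size):

```python
from transformers import AutoModelForSequenceClassification

# head saved by the regression fine-tune: its config has num_labels=1
reg = AutoModelForSequenceClassification.from_pretrained("regression-checkpoint")
print(reg.classifier.weight.shape)  # torch.Size([1, 768])

# fresh binary classification head
clf = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
print(clf.classifier.weight.shape)  # torch.Size([2, 768])
```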
Given that I'm not interested in keeping my regression head, how can I take only the "BERT" (backbone) weights of my model fine-tuned on regression?
I tried:
```python
from transformers import BertConfig, BertModel

# load only the bare encoder (no task head) from the regression-fine-tuned checkpoint
config = BertConfig.from_pretrained(BERT_BASE)
model = BertModel.from_pretrained(REGRESSION_FINE_TUNED_BERT, config=config)
model.save_pretrained(…)
```
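The classification model is then initialized from that saved directory, roughly like this (the path is a placeholder):

```python
from transformers import BertForSequenceClassification

# "saved-backbone" stands in for the directory passed to save_pretrained above
model = BertForSequenceClassification.from_pretrained("saved-backbone", num_labels=2)
```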
With this initialization I don't get errors, but I do get a warning I wasn't expecting, since I thought I had removed the regression/classification head:
```
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at /home/irene/similarity_eval/preprocessing/bert_similarity and are newly initialized: ['classifier.weight', 'classifier.bias']
```
Any advice?
Thanks in advance!!