Transfer learning

Hello everyone, I have a quick question. I trained a masked language model starting from “AutoModelForMaskedLM.from_pretrained” (as a form of continual pre-training). I then saved that model and want to use it as the starting point for a text classification task, so I use “AutoModel.from_pretrained” to load the model architecture and finally a “load_from_ckpt” function to copy the weights from the pre-trained model into the newly instantiated model. However, the layer names differ slightly between the two models: one has the prefix “bert” and the other does not. This causes a conflict when loading the weights, since I match on layer names. What is the best way to address this situation? What I am doing now is using “AutoModelForMaskedLM.from_pretrained” to load the architecture for fine-tuning as well, even though I am not training for MLM.
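In case it helps to make the mismatch concrete, here is a minimal sketch of one workaround: stripping the “bert.” prefix from the checkpoint’s state-dict keys before copying the weights. The `strip_prefix` helper and the example keys are made up for illustration (they mimic the naming difference described above), not the actual checkpoint contents:

```python
# Hypothetical sketch of the prefix mismatch: a checkpoint saved from
# AutoModelForMaskedLM prefixes the backbone weights with "bert.",
# while a bare AutoModel expects the same keys without that prefix.

def strip_prefix(state_dict, prefix="bert."):
    """Return a copy of state_dict with `prefix` removed from every key
    that starts with it; other keys are kept unchanged."""
    return {
        (k[len(prefix):] if k.startswith(prefix) else k): v
        for k, v in state_dict.items()
    }

# Illustrative keys, as an MLM checkpoint might name them
mlm_state = {
    "bert.embeddings.word_embeddings.weight": "tensor_a",
    "bert.encoder.layer.0.attention.self.query.weight": "tensor_b",
    "cls.predictions.bias": "tensor_c",  # MLM head, unused downstream
}

base_state = strip_prefix(mlm_state)
# Drop the MLM-head weights the classification model has no use for
base_state = {k: v for k, v in base_state.items() if not k.startswith("cls.")}
print(sorted(base_state))
# -> ['embeddings.word_embeddings.weight',
#     'encoder.layer.0.attention.self.query.weight']
```

After this remapping, the remaining keys line up with the bare model’s layer names and can be loaded with `load_state_dict(..., strict=False)`. (The simpler route, if the MLM checkpoint was saved with `save_pretrained`, is to load it directly into the task model with the matching `from_pretrained` call, which performs this remapping internally.)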
Many thanks!
