Continue LM pretraining with run_mlm - loss function clarification

dalia · March 14, 2022, 6:40pm

I’m trying to use Hugging Face’s TensorFlow run_mlm.py script to continue pretraining a BERT model, and there is one thing I don’t understand. In that script, the model is loaded with from_pretrained and then compiled with a dummy_loss function before model.fit(…) is called. The dummy_loss function defined in the script ignores y_true and simply returns the mean of y_pred. Is this loss function overridden somewhere? I can’t see how the script actually continues pretraining if this is the only loss function used.
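
To make the question concrete, here is a minimal sketch of the pattern being asked about, written in plain Python rather than TensorFlow. It rests on one assumption that is not confirmed anywhere in this thread: that the model computes the masked-LM loss internally when labels are part of its inputs and returns that loss as its output, so that y_pred is already the loss by the time dummy_loss sees it. The toy_masked_lm function, its arguments, and the use of -100 to mark unmasked positions are hypothetical stand-ins for illustration, not code from run_mlm.py.

```python
import math


def dummy_loss(y_true, y_pred):
    # Mirrors the behaviour described in the post: y_true is ignored
    # and the mean of y_pred is returned.
    return sum(y_pred) / len(y_pred)


def toy_masked_lm(logits, labels):
    """Hypothetical stand-in for a model that computes its own loss.

    logits: one list of vocabulary scores per token position.
    labels: target vocabulary index per position, or -100 for positions
            that were not masked and should not contribute to the loss
            (assumed convention).
    Returns the cross-entropy of each masked position.
    """
    losses = []
    for scores, label in zip(logits, labels):
        if label == -100:
            continue  # unmasked token, no loss contribution
        # Numerically stable log-sum-exp over the vocabulary scores.
        top = max(scores)
        log_z = top + math.log(sum(math.exp(s - top) for s in scores))
        losses.append(log_z - scores[label])
    return losses


# Three token positions, vocabulary of size 3; the middle token is unmasked.
logits = [[2.0, 0.5, 0.1], [0.2, 0.2, 0.2], [0.0, 3.0, 0.0]]
labels = [0, -100, 1]

# The "model output" is already a loss, so the compiled loss function only
# has to pass it through (here, by averaging it).
per_token_loss = toy_masked_lm(logits, labels)
loss = dummy_loss(y_true=None, y_pred=per_token_loss)
print(loss)
```

If that assumption holds, dummy_loss is not overridden at all: it just forwards a loss that was computed inside the model, and gradients flow through that value as usual.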
