Calculating accuracy while fine-tuning BertForMaskedLM

nes · September 30, 2020, 9:28am

Hello,

While fine-tuning, we can only see the loss and perplexity, which are useful.
Is it also possible to see the model's accuracy, and to use TensorBoard, when running the “run_language_modeling.py” script? It would be really helpful if anyone could explain how the loss is calculated for the BertForMaskedLM task, since no labels are provided during fine-tuning.

vblagoje · September 30, 2020, 11:24am

To replicate the original training loss from the paper, it should be calculated as described there: “The training loss is the sum of the mean masked LM likelihood and the mean next sentence prediction likelihood.” You can find more details in the BERT paper.
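To make the masked LM part of that loss concrete, here is a minimal sketch in plain Python. It assumes the usual convention that the "labels" are simply the original token ids at the masked positions, and that every other position carries an ignore value (`-100` here, the convention used by Hugging Face models and PyTorch's cross-entropy). The function name `masked_lm_metrics` and the toy logits are made up for illustration, and the next sentence prediction term is left out. The same masked positions also give you an accuracy figure.

```python
import math

IGNORE_INDEX = -100  # positions carrying this label do not contribute to the loss


def log_softmax(scores):
    """Numerically stable log-softmax over one position's vocabulary scores."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def masked_lm_metrics(logits, labels):
    """Mean cross-entropy, perplexity and accuracy over masked positions only.

    logits: one list of vocabulary scores per sequence position
    labels: original token id at masked positions, IGNORE_INDEX elsewhere
    """
    total_nll, correct, n_masked = 0.0, 0, 0
    for scores, label in zip(logits, labels):
        if label == IGNORE_INDEX:
            continue  # token was not masked, so there is nothing to predict
        total_nll -= log_softmax(scores)[label]
        predicted = max(range(len(scores)), key=scores.__getitem__)
        correct += int(predicted == label)
        n_masked += 1
    if n_masked == 0:
        raise ValueError("no masked positions in this batch")
    loss = total_nll / n_masked
    return {
        "loss": loss,
        "perplexity": math.exp(loss),
        "accuracy": correct / n_masked,
    }


# Toy batch: vocabulary of 4 tokens, 5 positions, positions 1 and 3 were masked.
logits = [
    [0.1, 0.2, 0.3, 0.4],
    [2.0, 0.0, 0.0, 0.0],  # model prefers token 0, original token was 0
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 3.0, 0.0, 0.0],  # model prefers token 1, original token was 2
    [0.5, 0.5, 0.5, 0.5],
]
labels = [IGNORE_INDEX, 0, IGNORE_INDEX, 2, IGNORE_INDEX]

metrics = masked_lm_metrics(logits, labels)
print(metrics)
```

In the Hugging Face scripts the masking and label creation are handled by the data collator (`DataCollatorForLanguageModeling`), which is why no label file is needed for this objective.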

nes · September 30, 2020, 11:30am

Thanks for your reply @vblagoje. Is it also possible to provide labels when fine-tuning BertForMaskedLM? I was following this example.

vblagoje · September 30, 2020, 12:05pm

I suspect you might be mixing up the notions of BERT pre-training and BERT fine-tuning. BERT pre-training trains the BERT model itself, which is then fine-tuned on downstream tasks. Very few people (mostly researchers) do BERT pre-training; developers mostly use the pre-trained models available on the HF hub for their particular tasks. That fine-tuning stage is where labels likely become relevant. In BERT pre-training there are no manually provided labels; it’s an unsupervised training task in which the targets come from the text itself.

nes · September 30, 2020, 2:06pm

I am actually working on a spelling correction task. For this task I have pre-trained the BERT model with the masked language modeling objective. After pre-training, I want to fine-tune the model. A spelling correction dataset usually contains the incorrect text together with its corrected version. So how can I supply a dataset that contains both the incorrect and the correct versions during fine-tuning? In other words, how do I provide the labels? That is the part I don't understand. I would be grateful if you could help me with this.

vblagoje · September 30, 2020, 2:15pm

Aha, I get it. For a spelling correction task, you likely need to start from the token classification examples and take it from there.
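One possible reading of this suggestion, as a rough sketch only: turn each (incorrect, correct) pair into per-word labels, so that a token classification model learns to flag misspelled words. The helper `make_detection_labels` and the example sentences are hypothetical. The sketch assumes whitespace tokenization and that both versions have the same number of words. With BERT's WordPiece tokenizer, the word-level labels would still have to be propagated to the subword tokens.

```python
def make_detection_labels(incorrect, correct):
    """Pair each word of the incorrect sentence with 1 (misspelled) or 0 (fine).

    Assumes both sentences split into the same number of whitespace-separated
    words, i.e. errors never merge or split words.
    """
    src_words = incorrect.split()
    tgt_words = correct.split()
    if len(src_words) != len(tgt_words):
        raise ValueError("sentences must have the same number of words")
    return [(src, int(src != tgt)) for src, tgt in zip(src_words, tgt_words)]


labeled = make_detection_labels("I havv a red aple", "I have a red apple")
print(labeled)
```

A classifier trained on such labels only detects errors. Producing the correction itself is a separate step, for example by masking the flagged word and asking the masked LM for candidates.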

nes · October 1, 2020, 10:35am

@vblagoje Thank you for your reply.

Here is how I am currently dealing with the task:

I am using BERT by masking the misspelled word to get candidate predictions with their probability scores. However, the results are not very good, so I thought of fine-tuning BERT.

I have checked the token classification examples, but I am not sure how token classification will help with my task. Could you please elaborate a bit more?
