How to correctly evaluate a Masked Language Model?

In RoBERTa they use accuracy and F1 scores of the language model. I got this code that I think computes the accuracy:

import numpy as np
from datasets import load_metric  # on newer versions: import evaluate; metric = evaluate.load("accuracy")

metric = load_metric("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    # Greedy prediction: take the most likely token at every position
    predictions = np.argmax(logits, axis=-1)

    # Labels are -100 everywhere except the masked positions,
    # so score the model only on the masked tokens
    mask = labels != -100
    results = metric.compute(predictions=predictions[mask], references=labels[mask])
    print(results)

    return results
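
To sanity-check the function outside the Trainer, I can call it on a dummy batch (one sequence of 4 positions over a toy vocabulary of 5 tokens, with two masked positions):

# Dummy batch: argmax of the logits is [1, 2, 0, 4]
logits = np.array([[[0.1, 0.9, 0.0, 0.0, 0.0],
                    [0.0, 0.0, 0.8, 0.1, 0.1],
                    [0.7, 0.1, 0.1, 0.1, 0.0],
                    [0.0, 0.0, 0.0, 0.0, 1.0]]])
labels = np.array([[-100, 2, -100, 3]])  # only positions 1 and 3 were masked

compute_metrics((logits, labels))  # {'accuracy': 0.5}: position 1 is right, position 3 is wrong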

Then create a Trainer and pass this as the compute_metrics parameter (no need to rename the key to eval_accuracy: the Trainer prefixes the returned metrics with eval_ itself):

from transformers import Trainer

trainer = Trainer(
    model=model,
    compute_metrics=compute_metrics,
)
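
For trainer.evaluate() to actually run, the Trainer also needs an evaluation dataset and, for masked language modeling, a collator that does the random masking and sets the unmasked labels to -100. A minimal sketch of the full setup, assuming a tokenized eval_dataset and using roberta-base as a placeholder checkpoint:

from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")

# Masks 15% of the tokens and sets the labels of all unmasked
# positions to -100, which is exactly what compute_metrics filters on
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="mlm-eval"),
    eval_dataset=eval_dataset,  # your tokenized evaluation split (assumed)
    data_collator=data_collator,
    compute_metrics=compute_metrics,
)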

Then run the evaluation:

results = trainer.evaluate()
accuracy = results["eval_accuracy"]
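
The returned dict also contains eval_loss, the mean cross-entropy over the masked tokens, so perplexity (the standard held-out metric for language models) comes essentially for free:

import math

perplexity = math.exp(results["eval_loss"])  # exp of the mean masked-token cross-entropy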