Error in fine-tuning BERT

Yes, you’re right that the problem is happening on the trainer.evaluate() step. It might be coming from the label_names argument in your TrainingArguments. From the docs we have:

The list of keys in your dictionary of inputs that correspond to the labels.

Will eventually default to ["labels"] except if the model used is one of the XxxForQuestionAnswering in which case it will default to ["start_positions", "end_positions"].

So it seems you need to provide a list like ['label'] instead of a plain string. If that doesn’t work, you could try renaming the “label” column in your CSV files to “labels” and then dropping the label_names argument from TrainingArguments altogether.
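For concreteness, here is a minimal sketch of both options, assuming you load your CSVs with the datasets library and the column is currently called "label" (the file names and output_dir are just placeholders):

```python
from datasets import load_dataset
from transformers import TrainingArguments

# Option 1: pass label_names as a list, not a string
training_args = TrainingArguments(
    output_dir="test_output",   # hypothetical output directory
    label_names=["label"],      # list of column names, e.g. ["label"]
)

# Option 2: rename the column to "labels" and drop label_names entirely
dataset = load_dataset("csv", data_files={"train": "train.csv", "validation": "valid.csv"})
dataset = dataset.rename_column("label", "labels")
```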

You can then check if it works by just running

trainer.evaluate()

which is faster than waiting for one epoch of training 🙂

As a tip, I would also pass the arguments to your TrainingArguments and Trainer explicitly as keyword arguments, e.g. output_dir="test_20210201_1200" in TrainingArguments, and similarly model and args in Trainer.
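Something along these lines (just a sketch, assuming model, the datasets, and compute_metrics are already defined as in your script):

```python
from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="test_20210201_1200",  # explicit keyword argument
    label_names=["label"],
)

trainer = Trainer(
    model=model,                      # your fine-tuned BERT model
    args=training_args,               # explicit, so nothing ends up positional
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    compute_metrics=compute_metrics,
)
```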

P.S. One thing that looks a bit odd is the way you load the metric:

metric = load_metric('f1', 'accuracy')

I don’t think you can load multiple metrics this way, since the second argument refers to the “configuration” of the metric (e.g. GLUE has a config for each task), not to a second metric. Nevertheless, this is probably not the source of the problem.
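If you do want both F1 and accuracy, one option (a sketch, assuming your compute_metrics receives the usual (logits, labels) tuple from the Trainer) is to load the metrics separately and merge their results:

```python
import numpy as np
from datasets import load_metric

f1_metric = load_metric("f1")
accuracy_metric = load_metric("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    # each compute() returns a dict, e.g. {"f1": ...} and {"accuracy": ...}
    results = f1_metric.compute(predictions=predictions, references=labels)
    results.update(accuracy_metric.compute(predictions=predictions, references=labels))
    return results
```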
