Error when fine-tuning with the Trainer API

Hi,

I’m fine-tuning a model with the Trainer API and following these instructions: https://huggingface.co/docs/transformers/training#finetuning-in-pytorch-with-the-trainer-api

However, after I defined the compute_metrics function and ran the script, it failed with the following error:

Traceback (most recent call last):
  File "/home/le/torch_tutorial/lm_1_perpl.py", line 77, in <module>
    trainer.train()
  File "/home/le/torch_tutorial/venv/lib/python3.9/site-packages/transformers/trainer.py", line 1391, in train
    self._maybe_log_save_evaluate(tr_loss, model, trial, epoch, ignore_keys_for_eval)
  File "/home/le/torch_tutorial/venv/lib/python3.9/site-packages/transformers/trainer.py", line 1491, in _maybe_log_save_evaluate
    metrics = self.evaluate(ignore_keys=ignore_keys_for_eval)
  File "/home/le/torch_tutorial/venv/lib/python3.9/site-packages/transformers/trainer.py", line 2113, in evaluate
    output = eval_loop(
  File "/home/le/torch_tutorial/venv/lib/python3.9/site-packages/transformers/trainer.py", line 2354, in evaluation_loop
    metrics = self.compute_metrics(EvalPrediction(predictions=all_preds, label_ids=all_labels))
  File "/home/le/torch_tutorial/lm_1_perpl.py", line 67, in compute_metrics
    return metric.compute(predictions=predictions, references=labels)
  File "/home/le/torch_tutorial/venv/lib/python3.9/site-packages/datasets/metric.py", line 393, in compute
    self.add_batch(predictions=predictions, references=references)
  File "/home/le/torch_tutorial/venv/lib/python3.9/site-packages/datasets/metric.py", line 434, in add_batch
    batch = self.info.features.encode_batch(batch)
  File "/home/le/torch_tutorial/venv/lib/python3.9/site-packages/datasets/features/features.py", line 1049, in encode_batch
    encoded_batch[key] = [encode_nested_example(self[key], obj) for obj in column]
  File "/home/le/torch_tutorial/venv/lib/python3.9/site-packages/datasets/features/features.py", line 1049, in <listcomp>
    encoded_batch[key] = [encode_nested_example(self[key], obj) for obj in column]
  File "/home/le/torch_tutorial/venv/lib/python3.9/site-packages/datasets/features/features.py", line 853, in encode_nested_example
    return schema.encode_example(obj)
  File "/home/le/torch_tutorial/venv/lib/python3.9/site-packages/datasets/features/features.py", line 297, in encode_example
    return int(value)
TypeError: only size-1 arrays can be converted to Python scalars
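The last frame is inside datasets, where encode_example calls int(value) on each prediction. My guess (an assumption, not something I've verified in the library source beyond this traceback) is that the error reproduces whenever a single "prediction" is itself a multi-element array rather than a scalar:

```python
import numpy as np

# Minimal reproduction of the final TypeError: calling int() on a
# multi-element numpy array (e.g. one per-sequence row of argmax output)
# fails, while int() on a size-1 array works.
value = np.array([3, 7])  # made-up values, standing in for one row of predictions
try:
    int(value)
except TypeError as e:
    print("raised:", e)

print(int(np.array([3])))  # a size-1 array converts fine
```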

Do you have any ideas about what could be causing this? The only changes I made were defining the compute_metrics function (as in the tutorial) and passing it to Trainer via the compute_metrics argument; before that, everything worked perfectly:

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return metric.compute(predictions=predictions, references=labels)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=lm_datasets["train"],
    eval_dataset=lm_datasets["validation"],
    compute_metrics=compute_metrics,
)
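For context, here is a minimal sketch of the shapes I believe are involved (an assumption on my part: since this is a language-modeling setup, I expect logits of shape (batch_size, seq_len, vocab_size), so argmax over the last axis leaves one array per sequence instead of one scalar per example; all values below are made up):

```python
import numpy as np

# Assumed shapes for a causal LM batch: 2 sequences of length 5, vocab of 10.
logits = np.random.rand(2, 5, 10)
labels = np.random.randint(0, 10, size=(2, 5))

predictions = np.argmax(logits, axis=-1)
print(predictions.shape)  # (2, 5): each "prediction" is a length-5 array

# If the metric expects one scalar per reference, flattening would make
# every element a scalar -- this is my guess at what the metric wants.
flat_preds = predictions.flatten()
flat_labels = labels.flatten()
print(flat_preds.shape)  # (10,)
```

If flattening like this is the intended fix, I'd be happy to confirm, but I wanted to check whether something else in my setup is wrong first.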