Error when fine-tuning with the Trainer API


I’m fine-tuning a model with the Trainer API and following these instructions:

However, after I defined the compute_metrics function and ran the script, it gave me the following error:

Traceback (most recent call last):
  File "/home/le/torch_tutorial/", line 77, in <module>
  File "/home/le/torch_tutorial/venv/lib/python3.9/site-packages/transformers/", line 1391, in train
    self._maybe_log_save_evaluate(tr_loss, model, trial, epoch, ignore_keys_for_eval)
  File "/home/le/torch_tutorial/venv/lib/python3.9/site-packages/transformers/", line 1491, in _maybe_log_save_evaluate
    metrics = self.evaluate(ignore_keys=ignore_keys_for_eval)
  File "/home/le/torch_tutorial/venv/lib/python3.9/site-packages/transformers/", line 2113, in evaluate
    output = eval_loop(
  File "/home/le/torch_tutorial/venv/lib/python3.9/site-packages/transformers/", line 2354, in evaluation_loop
    metrics = self.compute_metrics(EvalPrediction(predictions=all_preds, label_ids=all_labels))
  File "/home/le/torch_tutorial/", line 67, in compute_metrics
    return metric.compute(predictions=predictions, references=labels)
  File "/home/le/torch_tutorial/venv/lib/python3.9/site-packages/datasets/", line 393, in compute
    self.add_batch(predictions=predictions, references=references)
  File "/home/le/torch_tutorial/venv/lib/python3.9/site-packages/datasets/", line 434, in add_batch
    batch =
  File "/home/le/torch_tutorial/venv/lib/python3.9/site-packages/datasets/features/", line 1049, in encode_batch
    encoded_batch[key] = [encode_nested_example(self[key], obj) for obj in column]
  File "/home/le/torch_tutorial/venv/lib/python3.9/site-packages/datasets/features/", line 1049, in <listcomp>
    encoded_batch[key] = [encode_nested_example(self[key], obj) for obj in column]
  File "/home/le/torch_tutorial/venv/lib/python3.9/site-packages/datasets/features/", line 853, in encode_nested_example
    return schema.encode_example(obj)
  File "/home/le/torch_tutorial/venv/lib/python3.9/site-packages/datasets/features/", line 297, in encode_example
    return int(value)
TypeError: only size-1 arrays can be converted to Python scalars

Do you have any idea what could be causing it? I haven't changed anything in my code except adding the compute_metrics function (as in the tutorial) and passing the compute_metrics argument to Trainer (everything was working perfectly before this addition):

import numpy as np

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return metric.compute(predictions=predictions, references=labels)

trainer = Trainer(
    ...,                              # other arguments unchanged
    compute_metrics=compute_metrics,  # newly added
)

What I would do in this case is simply add a print statement inside the compute_metrics function to view your logits and labels, their shapes, etc.

I did print the shapes of the variables inside of compute_metrics but they seem to be fine (at least they have the same shape):

Shape logits:  (148, 128, 50265)
Shape labels: (148, 128)
Shape predictions:  (148, 128)

Or is something off here?
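For reference, the `int(value)` call at the bottom of the traceback fails whenever it receives a whole row of token predictions instead of a single scalar, which can be reproduced in isolation (a standalone sketch, not the metric's actual code):

```python
import numpy as np

# Encoding each reference/prediction goes through int(value).
# A size-1 array converts fine:
print(int(np.array(3)))  # -> 3

# ...but a multi-element row, like one row of a (148, 128) array, does not:
try:
    int(np.array([1, 2, 3]))
except TypeError as e:
    print(e)  # the same TypeError as in the traceback
```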

You didn’t tell us which metric you are using, so it’s hard to help without that info. It might expect something flat, for instance (if it’s accuracy).


I was indeed trying to use accuracy but did not know that it expects flat values, so thanks!
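For anyone hitting the same error: one way to make this work with a token-level accuracy metric is to flatten both arrays before computing the score. This is a sketch; the accuracy value is computed with plain NumPy here so the example is self-contained, but with the real metric you would call `metric.compute` on the flattened arrays instead:

```python
import numpy as np

def compute_metrics(eval_pred):
    logits, labels = eval_pred                # (batch, seq_len, vocab), (batch, seq_len)
    predictions = np.argmax(logits, axis=-1)  # (batch, seq_len)
    # Accuracy expects flat 1-D sequences of scalars, so collapse the batch
    # and sequence dimensions before handing the arrays to the metric:
    predictions = predictions.flatten()
    labels = labels.flatten()
    # With the real metric this line would be:
    #   return metric.compute(predictions=predictions, references=labels)
    return {"accuracy": float((predictions == labels).mean())}
```

If your labels use -100 for padded or ignored tokens (the convention the Trainer's loss follows), you would also want to mask those positions out before computing accuracy.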