KeyError: 'loss' while training QnA

fenil · March 3, 2021, 7:46am

I was fine-tuning BertForQuestionAnswering on the nlp squad dataset with the following arguments:

training_args = TrainingArguments(
    "test-qa-squad",
    learning_rate=2e-5,
    weight_decay=0.01,
    label_names = ["start_positions", "end_positions"],
    num_train_epochs=5,
    load_best_model_at_end=True,
    evaluation_strategy='epoch'
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dl,
    eval_dataset=train_dl
)

Calling trainer.train() then trains for some batches, but after a specific batch it throws this error (before the first epoch completes):

KeyError                                  Traceback (most recent call last)

<ipython-input-19-3435b262f1ae> in <module>()
----> 1 trainer.train()

3 frames

/usr/local/lib/python3.7/dist-packages/transformers/file_utils.py in __getitem__(self, k)
   1444         if isinstance(k, str):
   1445             inner_dict = {k: v for (k, v) in self.items()}
-> 1446             return inner_dict[k]
   1447         else:
   1448             return self.to_tuple()[k]

KeyError: 'loss'

Is this an issue with the dataset? Any help is much appreciated.

sgugger · March 3, 2021, 1:54pm

You should double-check that your dataset has items that are dictionaries with the keys "start_positions" and "end_positions" (missing keys may be why the model is not returning the loss).

Also, you seem to be passing dataloaders to the Trainer? It takes datasets.
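As a rough illustration of the difference, a dataset is an indexable object that returns one example per index, while a dataloader yields already-collated batches. The sketch below is a plain-Python stand-in with hypothetical names (`QADataset`, `encodings`) and made-up toy values; in practice you would usually subclass `torch.utils.data.Dataset` or use a dataset object from the datasets library directly.

```python
class QADataset:
    """Hypothetical map-style dataset: one dict of features per example."""

    def __init__(self, encodings):
        # encodings maps each feature name to a list with one entry per example.
        self.encodings = encodings

    def __len__(self):
        return len(self.encodings["input_ids"])

    def __getitem__(self, idx):
        # Return a single example as a dict, including both label keys.
        return {key: values[idx] for key, values in self.encodings.items()}


# Toy values, made up for illustration only.
encodings = {
    "input_ids": [[101, 2054, 102], [101, 2073, 102]],
    "attention_mask": [[1, 1, 1], [1, 1, 1]],
    "start_positions": [1, 0],
    "end_positions": [1, 2],
}
train_dataset = QADataset(encodings)
print(sorted(train_dataset[0].keys()))
```

An object of this shape (not a DataLoader wrapped around it) is what goes into train_dataset and eval_dataset; the Trainer builds its own dataloaders from it.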

Lastly, for easy debugging you can do the following:

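# Grab a single batch from the training dataloader.
for batch in trainer.get_train_dataloader():
    break
# Move the tensors to the device the Trainer uses (also works without a GPU).
batch = {k: v.to(trainer.args.device) for k, v in batch.items()}
outputs = trainer.model(**batch)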

to easily inspect what’s in your batch and your outputs.
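The label-key check mentioned earlier can also be done before loading anything onto a device. The helper below is a minimal sketch with a hypothetical name (`missing_label_keys`) and a made-up example item; it is not part of the Trainer API.

```python
from collections.abc import Mapping


def missing_label_keys(item, label_names):
    """Return the label names that are absent from one dataset item."""
    if not isinstance(item, Mapping):
        raise TypeError(f"expected a dict-like item, got {type(item).__name__}")
    return [name for name in label_names if name not in item]


# Made-up item that lacks the question-answering labels.
item = {"input_ids": [101, 2054, 102], "attention_mask": [1, 1, 1]}
print(missing_label_keys(item, ["start_positions", "end_positions"]))
# -> ['start_positions', 'end_positions']
```

With a real dataset you would pass something like train_dataset[0] together with the same list given as label_names in TrainingArguments. A non-empty result means the model receives no labels for that example, which would explain a missing loss.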

mjc00 · March 17, 2022, 7:30pm

@sgugger I am running into a similar problem (KeyError: 'loss'). My dataset does have dictionaries as its items (see image),

and my code is as follows:

from transformers import Trainer, TrainingArguments
batch_size = 64
logging_steps = len(dataset["train"]) // batch_size
model_name = f"{model_ckpt}-finetuned-test"
training_args = TrainingArguments(output_dir=model_name,
                                  num_train_epochs=2,
                                  learning_rate=2e-5,
                                  per_device_train_batch_size=batch_size,
                                  per_device_eval_batch_size=batch_size,
                                  weight_decay=0.01,
                                  evaluation_strategy="epoch",
                                  disable_tqdm=False,
                                  logging_steps=logging_steps,
                                  label_names = ['CategoryCode'],
                                  #push_to_hub=True, 
                                  log_level="error")


trainer = Trainer(model=model, 
                  args=training_args, 
                  compute_metrics=compute_metrics,
                  train_dataset=dataset["train"],
                  eval_dataset=dataset["vald"],
                  tokenizer=tokenizer)
trainer.train();

Note: I am running the above code locally on a Mac M1.
