I'm having trouble with my Hugging Face Trainer


I’m using Hugging Face Transformers to train an NLP model. Training fails with an error during the validation stage of the first epoch.

Initially I had an issue with my metric function. After fixing it (I think), a new TypeError is thrown:

TypeError: 'float' object does not support item assignment

The full traceback is shown below.



TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_24/4032920361.py in <module>
----> 1 trainer.train()

/opt/conda/lib/python3.7/site-packages/transformers/trainer.py in train(self, resume_from_checkpoint, trial, ignore_keys_for_eval, **kwargs)
   1411             resume_from_checkpoint=resume_from_checkpoint,
   1412             trial=trial,
-> 1413             ignore_keys_for_eval=ignore_keys_for_eval,
   1414         )

/opt/conda/lib/python3.7/site-packages/transformers/trainer.py in _inner_training_loop(self, batch_size, args, resume_from_checkpoint, trial, ignore_keys_for_eval)
   1742             self.control = self.callback_handler.on_epoch_end(args, self.state, self.control)
-> 1743             self._maybe_log_save_evaluate(tr_loss, model, trial, epoch, ignore_keys_for_eval)
   1745             if DebugOption.TPU_METRICS_DEBUG in self.args.debug:

/opt/conda/lib/python3.7/site-packages/transformers/trainer.py in _maybe_log_save_evaluate(self, tr_loss, model, trial, epoch, ignore_keys_for_eval)
   1910         metrics = None
   1911         if self.control.should_evaluate:
-> 1912             metrics = self.evaluate(ignore_keys=ignore_keys_for_eval)
   1913             self._report_to_hp_search(trial, epoch, metrics)

/opt/conda/lib/python3.7/site-packages/transformers/trainer.py in evaluate(self, eval_dataset, ignore_keys, metric_key_prefix)
   2626             prediction_loss_only=True if self.compute_metrics is None else None,
   2627             ignore_keys=ignore_keys,
-> 2628             metric_key_prefix=metric_key_prefix,
   2629         )

/opt/conda/lib/python3.7/site-packages/transformers/trainer.py in evaluation_loop(self, dataloader, description, prediction_loss_only, ignore_keys, metric_key_prefix)
   2910         if all_losses is not None:
-> 2911             metrics[f"{metric_key_prefix}_loss"] = all_losses.mean().item()
   2913         # Prefix all keys with metric_key_prefix + '_'

TypeError: 'float' object does not support item assignment

I think the error occurs after my metric is computed.
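
The same message can be reproduced outside the Trainer with a plain float, which makes me suspect that `metrics` is a float rather than a dict by the time line 2911 runs. This is only a minimal sketch: the `metrics` value and the "eval_loss" key here are stand-ins I chose for illustration, not the actual Trainer internals.

```python
# Hypothetical stand-in for the value held in `metrics` (an assumption).
metrics = 0.5261

try:
    # Same kind of item assignment as the failing line in the traceback.
    metrics["eval_loss"] = 0.1
    error_message = None
except TypeError as err:
    error_message = str(err)

print(error_message)

# The same assignment succeeds when `metrics` is a dict.
metrics_dict = {"rmse": 0.5261}
metrics_dict["eval_loss"] = 0.1
print(metrics_dict)
```

If this is what is happening, the value returned by my metric function is being used directly as `metrics`.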

My metric function, which computes RMSE, is shown below.

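import torch

def rmse(valid_pred):
    # Predictions and labels arrive as NumPy arrays.
    preds = valid_pred.predictions
    targs = valid_pred.label_ids
    # Returns a 0-dimensional tensor holding the root mean squared error.
    return torch.nn.functional.mse_loss(torch.from_numpy(preds), torch.from_numpy(targs)).sqrt()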

valid_pred is the data structure in which the evaluation predictions of my trainer are stored.

The metric function above returns a 0-dimensional PyTorch tensor (for example, one output was tensor(0.5261)).
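
For comparison, here is a sketch of a variant that returns a dict of plain Python floats instead of a tensor, since the Trainer's compute_metrics function is documented as returning a dictionary that maps metric names to values. The names `rmse_as_dict` and `FakeEvalPrediction` are hypothetical ones made up for this example, and the RMSE is computed with NumPy because the predictions and labels are already NumPy arrays. I have not yet confirmed that this resolves the error in my actual training run.

```python
from collections import namedtuple

import numpy as np

# Hypothetical stand-in for the object the Trainer passes to the metric function.
FakeEvalPrediction = namedtuple("FakeEvalPrediction", ["predictions", "label_ids"])


def rmse_as_dict(valid_pred):
    preds = np.asarray(valid_pred.predictions, dtype=np.float64)
    targs = np.asarray(valid_pred.label_ids, dtype=np.float64)
    # Root mean squared error as a plain float, wrapped in a dict keyed by name.
    return {"rmse": float(np.sqrt(np.mean((preds - targs) ** 2)))}


# Toy data, not taken from my real dataset.
example = FakeEvalPrediction(
    predictions=np.array([1.0, 2.0, 3.0]),
    label_ids=np.array([1.0, 2.0, 5.0]),
)
result = rmse_as_dict(example)
print(result)
```

With a dict as the return value, an item assignment such as the one on line 2911 of the traceback would add a key rather than fail.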

I would appreciate any input on why this error is being thrown! If you need any more information, please let me know.