Getting KeyError: 203 when running trainer.train()

sba-h9 · July 16, 2023, 10:51am

Hello, I am trying to fine-tune an mT5-based model for a reading comprehension task in Farsi.
Whenever I run the trainer.train() cell in Google Colab, I get the following error:

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /usr/local/lib/python3.10/dist-packages/pandas/core/indexes/base.py:3802 in get_loc │
│ │
│ 3799 │ │ │ │ ) │
│ 3800 │ │ │ casted_key = self._maybe_cast_indexer(key) │
│ 3801 │ │ │ try: │
│ ❱ 3802 │ │ │ │ return self._engine.get_loc(casted_key) │
│ 3803 │ │ │ except KeyError as err: │
│ 3804 │ │ │ │ raise KeyError(key) from err │
│ 3805 │ │ │ except TypeError: │
│ │
│ in pandas._libs.index.IndexEngine.get_loc:138 │
│ │
│ in pandas._libs.index.IndexEngine.get_loc:165 │
│ │
│ in pandas._libs.hashtable.PyObjectHashTable.get_item:5745 │
│ │
│ in pandas._libs.hashtable.PyObjectHashTable.get_item:5753 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
KeyError: 203

The above exception was the direct cause of the following exception:

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ in <cell line: 1>:1 │
│ │
│ /usr/local/lib/python3.10/dist-packages/transformers/trainer.py:1645 in train │
│ │
│ 1642 │ │ inner_training_loop = find_executable_batch_size( │
│ 1643 │ │ │ self._inner_training_loop, self._train_batch_size, args.auto_find_batch_size │
│ 1644 │ │ ) │
│ ❱ 1645 │ │ return inner_training_loop( │
│ 1646 │ │ │ args=args, │
│ 1647 │ │ │ resume_from_checkpoint=resume_from_checkpoint, │
│ 1648 │ │ │ trial=trial, │
│ │
│ /usr/local/lib/python3.10/dist-packages/transformers/trainer.py:1916 in _inner_training_loop │
│ │
│ 1913 │ │ │ │ rng_to_sync = True │
│ 1914 │ │ │ │
│ 1915 │ │ │ step = -1 │
│ ❱ 1916 │ │ │ for step, inputs in enumerate(epoch_iterator): │
│ 1917 │ │ │ │ total_batched_samples += 1 │
│ 1918 │ │ │ │ if rng_to_sync: │
│ 1919 │ │ │ │ │ self._load_rng_state(resume_from_checkpoint) │
│ │
│ /usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py:633 in __next__ │
│ │
│ 630 │ │ │ if self._sampler_iter is None: │
│ 631 │ │ │ │ # TODO(https://github.com/pytorch/pytorch/issues/76750) │
│ 632 │ │ │ │ self._reset() # type: ignore[call-arg] │
│ ❱ 633 │ │ │ data = self._next_data() │
│ 634 │ │ │ self._num_yielded += 1 │
│ 635 │ │ │ if self._dataset_kind == _DatasetKind.Iterable and \ │
│ 636 │ │ │ │ │ self._IterableDataset_len_called is not None and \ │
│ │
│ /usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py:677 in _next_data │
│ │
│ 674 │ │
│ 675 │ def _next_data(self): │
│ 676 │ │ index = self._next_index() # may raise StopIteration │
│ ❱ 677 │ │ data = self._dataset_fetcher.fetch(index) # may raise StopIteration │
│ 678 │ │ if self._pin_memory: │
│ 679 │ │ │ data = _utils.pin_memory.pin_memory(data, self._pin_memory_device) │
│ 680 │ │ return data │
│ │
│ /usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py:51 in fetch │
│ │
│ 48 │ │ │ if hasattr(self.dataset, "__getitems__") and self.dataset.__getitems__: │
│ 49 │ │ │ │ data = self.dataset.__getitems__(possibly_batched_index) │
│ 50 │ │ │ else: │
│ ❱ 51 │ │ │ │ data = [self.dataset[idx] for idx in possibly_batched_index] │
│ 52 │ │ else: │
│ 53 │ │ │ data = self.dataset[possibly_batched_index] │
│ 54 │ │ return self.collate_fn(data) │
│ │
│ /usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py:51 in <listcomp> │
│ │
│ 48 │ │ │ if hasattr(self.dataset, "__getitems__") and self.dataset.__getitems__: │
│ 49 │ │ │ │ data = self.dataset.__getitems__(possibly_batched_index) │
│ 50 │ │ │ else: │
│ ❱ 51 │ │ │ │ data = [self.dataset[idx] for idx in possibly_batched_index] │
│ 52 │ │ else: │
│ 53 │ │ │ data = self.dataset[possibly_batched_index] │
│ 54 │ │ return self.collate_fn(data) │
│ │
│ /usr/local/lib/python3.10/dist-packages/pandas/core/frame.py:3807 in __getitem__ │
│ │
│ 3804 │ │ if is_single_key: │
│ 3805 │ │ │ if self.columns.nlevels > 1: │
│ 3806 │ │ │ │ return self._getitem_multilevel(key) │
│ ❱ 3807 │ │ │ indexer = self.columns.get_loc(key) │
│ 3808 │ │ │ if is_integer(indexer): │
│ 3809 │ │ │ │ indexer = [indexer] │
│ 3810 │ │ else: │
│ │
│ /usr/local/lib/python3.10/dist-packages/pandas/core/indexes/base.py:3804 in get_loc │
│ │
│ 3801 │ │ │ try: │
│ 3802 │ │ │ │ return self._engine.get_loc(casted_key) │
│ 3803 │ │ │ except KeyError as err: │
│ ❱ 3804 │ │ │ │ raise KeyError(key) from err │
│ 3805 │ │ │ except TypeError: │
│ 3806 │ │ │ │ # If we have a listlike key, _check_indexing_error will raise │
│ 3807 │ │ │ │ # InvalidIndexError. Otherwise we fall through and re-raise │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
KeyError: 203

Here is my code:

from transformers import AutoModelForSeq2SeqLM, Seq2SeqTrainer

# Function that returns an untrained model to be trained
def model_init():
    return AutoModelForSeq2SeqLM.from_pretrained(model_dir)

trainer = Seq2SeqTrainer(
    model_init=model_init,
    args=args,
    train_dataset=train_df,
    eval_dataset=eval_df,
    data_collator=data_collator,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics
)

---------------------- the Next Cell ---------------------------------

# Start TensorBoard before training to monitor it in progress

%load_ext tensorboard
%tensorboard --logdir "/content/gdrive/MyDrive/parsinlu/runs"

---------------------- the Next Cell ---------------------------------

trainer.train()

I would appreciate it if anybody could help me.
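
A note on the likely cause, offered as a guess from the traceback rather than a confirmed diagnosis: the last frames go from the DataLoader's `self.dataset[idx]` into pandas/core/frame.py, which suggests `train_df` and `eval_df` are pandas DataFrames. Indexing a DataFrame with an integer in square brackets looks up a column label, not a row, so a sampled index such as 203 would fail with KeyError. The sketch below reproduces that behavior on a small made-up frame and shows one possible workaround. The column names and the `DataFrameDataset` wrapper are hypothetical and not taken from the original code.

```python
import pandas as pd

# Hypothetical stand-in for train_df; the real columns are not shown in the post.
train_df = pd.DataFrame({
    "input_ids": [[1, 2], [3, 4], [5, 6]],
    "labels": [[7], [8], [9]],
})

# A map-style DataLoader fetches rows with dataset[idx]. On a DataFrame,
# an integer in square brackets is treated as a column label instead.
try:
    train_df[1]
    column_lookup_failed = False
except KeyError as err:
    column_lookup_failed = True
    print(f"KeyError: {err}")


class DataFrameDataset:
    """Minimal map-style wrapper (hypothetical): dataset[idx] returns row idx as a dict."""

    def __init__(self, frame):
        # Reset the index so positions 0..len-1 are always valid.
        self.frame = frame.reset_index(drop=True)

    def __len__(self):
        return len(self.frame)

    def __getitem__(self, idx):
        # iloc selects by row position, which is what the DataLoader expects.
        return self.frame.iloc[idx].to_dict()


wrapped = DataFrameDataset(train_df)
print(wrapped[1])
```

If this is indeed the cause, another option would be to convert each frame with `datasets.Dataset.from_pandas(train_df)` before passing it to the trainer. Either way, the rows still need to contain tokenized features that the data collator can batch.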
