KeyError: 'input_features' when running trainer.train() in Fine Tune Whisper

I am following Sanchit Gandhi's blog post "Fine-Tune Whisper For Multilingual ASR with 🤗 Transformers". When I start training with trainer.train(), I get this error:
│
│ /usr/local/lib/python3.10/dist-packages/transformers/trainer.py:1539 in train │
│ │
│ 1536 │ │ inner_training_loop = find_executable_batch_size( │
│ 1537 │ │ │ self._inner_training_loop, self._train_batch_size, args.auto_find_batch_size │
│ 1538 │ │ ) │
│ ❱ 1539 │ │ return inner_training_loop( │
│ 1540 │ │ │ args=args, │
│ 1541 │ │ │ resume_from_checkpoint=resume_from_checkpoint, │
│ 1542 │ │ │ trial=trial, │
│ │
│ /usr/local/lib/python3.10/dist-packages/transformers/trainer.py:1779 in _inner_training_loop │
│ │
│ 1776 │ │ │ │ rng_to_sync = True │
│ 1777 │ │ │ │
│ 1778 │ │ │ step = -1 │
│ ❱ 1779 │ │ │ for step, inputs in enumerate(epoch_iterator): │
│ 1780 │ │ │ │ total_batched_samples += 1 │
│ 1781 │ │ │ │ if rng_to_sync: │
│ 1782 │ │ │ │ │ self._load_rng_state(resume_from_checkpoint) │
│ │
│ /usr/local/lib/python3.10/dist-packages/accelerate/data_loader.py:377 in __iter__ │
│ │
│ 374 │ │ dataloader_iter = super().__iter__() │
│ 375 │ │ # We iterate one batch ahead to check when we are at the end │
│ 376 │ │ try: │
│ ❱ 377 │ │ │ current_batch = next(dataloader_iter) │
│ 378 │ │ except StopIteration: │
│ 379 │ │ │ yield │
│ 380 │
│ │
│ /usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py:633 in __next__ │
│ │
│ 630 │ │ │ if self._sampler_iter is None: │
│ 631 │ │ │ │ # TODO(Bug in dataloader iterator found by mypy · Issue #76750 · pytorch/pytorch · GitHub) │
│ 632 │ │ │ │ self._reset() # type: ignore[call-arg] │
│ ❱ 633 │ │ │ data = self._next_data() │
│ 634 │ │ │ self._num_yielded += 1 │
│ 635 │ │ │ if self._dataset_kind == _DatasetKind.Iterable and \ │
│ 636 │ │ │ │ │ self._IterableDataset_len_called is not None and \ │
│ │
│ /usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py:677 in _next_data │
│ │
│ 674 │ │
│ 675 │ def _next_data(self): │
│ 676 │ │ index = self._next_index() # may raise StopIteration │
│ ❱ 677 │ │ data = self._dataset_fetcher.fetch(index) # may raise StopIteration │
│ 678 │ │ if self._pin_memory: │
│ 679 │ │ │ data = _utils.pin_memory.pin_memory(data, self._pin_memory_device) │
│ 680 │ │ return data │
│ │
│ /usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py:54 in fetch │
│ │
│ 51 │ │ │ │ data = [self.dataset[idx] for idx in possibly_batched_index] │
│ 52 │ │ else: │
│ 53 │ │ │ data = self.dataset[possibly_batched_index] │
│ ❱ 54 │ │ return self.collate_fn(data) │
│ 55 │
│ in __call__:13 │
│ in <listcomp>:13 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
KeyError: 'input_features'

I couldn't find a solution for this. Please help me solve this issue.

Thanks in advance


Hi, have you found a solution?

No, I didn't find a solution.

Hello, did you find the solution?
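
Not a confirmed fix, but the last frames of the traceback show the KeyError being raised inside the data collator (self.collate_fn(data)), which suggests the examples reaching the collator have no 'input_features' key. A minimal sketch for checking that is below, in plain Python with no Whisper dependencies. The helper names (missing_keys, collate_input_features) and the example dictionaries are hypothetical, made up for illustration. They are not taken from the blog or from the poster's notebook.

```python
from typing import Any, Dict, List, Sequence


def missing_keys(example: Dict[str, Any],
                 required: Sequence[str] = ("input_features",)) -> List[str]:
    """Return the required keys that are absent from one dataset example."""
    return [key for key in required if key not in example]


def collate_input_features(features: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Hypothetical stand-in for the collator step that indexes each example."""
    return [{"input_features": feature["input_features"]} for feature in features]


# Hypothetical examples: one still holds only raw columns, one was preprocessed.
raw_example = {"audio": [0.0, 0.1], "sentence": "hello"}
prepared_example = {"input_features": [[0.0, 0.1]], "labels": [1, 2, 3]}

print(missing_keys(raw_example))       # ['input_features']
print(missing_keys(prepared_example))  # []

try:
    collate_input_features([raw_example])
except KeyError as err:
    # Same error type and key as in the traceback above.
    print("KeyError:", err)
```

If the training split is a datasets Dataset, printing its column_names or the keys of its first example before calling trainer.train() should show whether 'input_features' is actually there. If it is missing, the preprocessing step that is supposed to create that column may not have been applied to the dataset handed to the trainer. That is a guess from the traceback, not something verified against the original notebook.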