RuntimeError when training starts: expected scalar type Long but found Int

Hey Guys,
I’m trying to adapt the notebook [Audio-Classification-on-Keyword-Spotting.ipynb](https://colab.research.google.com/github/huggingface/notebooks/blob/master/examples/audio_classification.ipynb) to a custom dataset.
The dataset looks like this:

DatasetDict({
    train: Dataset({
        features: ['path', 'file', 'emotion', 'label', 'audio'],
        num_rows: 271
    })
    eval: Dataset({
        features: ['path', 'file', 'emotion', 'label', 'audio'],
        num_rows: 34
    })
    test: Dataset({
        features: ['path', 'file', 'emotion', 'label', 'audio'],
        num_rows: 34
    })
})

The features look like this:

{'path': Value(dtype='string', id=None),
 'file': Value(dtype='string', id=None),
 'emotion': Value(dtype='string', id=None),
 'label': ClassLabel(num_classes=4, names=['angry', 'happy', 'neutral', 'sad'], id=None),
 'audio': {'array': Sequence(feature=Value(dtype='float32', id=None), length=-1, id=None),
  'sampling_rate': Value(dtype='int64', id=None)}}

I can run the notebook without problems until I get to the model training. There, I keep getting the error below when the loss is calculated. As the base model for fine-tuning I tried “superb/hubert-large-superb-er” and “facebook/wav2vec2-base”. Both end with the same error message.

RuntimeError: expected scalar type Long but found Int

Most likely this is a very basic issue but I have no clue how to fix it. Can anybody help me with this, please?

Best regards, Andy

Full error message

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_24052/4032920361.py in <module>
----> 1 trainer.train()

~\Anaconda3\lib\site-packages\transformers\trainer.py in train(self, resume_from_checkpoint, trial, ignore_keys_for_eval, **kwargs)
   1398                         tr_loss_step = self.training_step(model, inputs)
   1399                 else:
-> 1400                     tr_loss_step = self.training_step(model, inputs)
   1401 
   1402                 if (

~\Anaconda3\lib\site-packages\transformers\trainer.py in training_step(self, model, inputs)
   1982 
   1983         with self.autocast_smart_context_manager():
-> 1984             loss = self.compute_loss(model, inputs)
   1985 
   1986         if self.args.n_gpu > 1:

~\Anaconda3\lib\site-packages\transformers\trainer.py in compute_loss(self, model, inputs, return_outputs)
   2014         else:
   2015             labels = None
-> 2016         outputs = model(**inputs)
   2017         # Save past state if it exists
   2018         # TODO: this needs to be fixed and made cleaner later.

~\Anaconda3\lib\site-packages\torch\nn\modules\module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

~\Anaconda3\lib\site-packages\transformers\models\hubert\modeling_hubert.py in forward(self, input_values, attention_mask, output_attentions, output_hidden_states, return_dict, labels)
   1318         if labels is not None:
   1319             loss_fct = CrossEntropyLoss()
-> 1320             loss = loss_fct(logits.view(-1, self.config.num_labels), labels.view(-1))
   1321 
   1322         if not return_dict:

~\Anaconda3\lib\site-packages\torch\nn\modules\module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

~\Anaconda3\lib\site-packages\torch\nn\modules\loss.py in forward(self, input, target)
   1148 
   1149     def forward(self, input: Tensor, target: Tensor) -> Tensor:
-> 1150         return F.cross_entropy(input, target, weight=self.weight,
   1151                                ignore_index=self.ignore_index, reduction=self.reduction,
   1152                                label_smoothing=self.label_smoothing)

~\Anaconda3\lib\site-packages\torch\nn\functional.py in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction, label_smoothing)
   2844     if size_average is not None or reduce is not None:
   2845         reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 2846     return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
   2847 
   2848 

RuntimeError: expected scalar type Long but found Int

If somebody stumbles into the same issue:
In the meantime I was able to figure it out. The problem was that I was running the code locally without a GPU, which broke execution at the loss computation shown in the traceback. I was, however, able to run the code on Colab and Kaggle with a GPU. I had assumed that Huggingface would handle the case where no GPU is available, but that doesn’t seem to be so, at least in my setup.
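A note on the dtype itself, as a minimal sketch rather than a verified fix for my local setup: the error message says the loss received int32 labels where int64 (Long) was expected, so casting the labels before the loss is computed may be enough on CPU as well. The helper name `ensure_long_labels` and the toy batch are made up for illustration; they are not part of the notebook or of transformers.

```python
import torch
import torch.nn.functional as F


def ensure_long_labels(batch, label_key="labels"):
    """Return a shallow copy of the batch with the labels cast to int64 (Long).

    Hypothetical helper for illustration only; `label_key` is an assumption
    about how the labels are named in the batch dict.
    """
    fixed = dict(batch)
    if label_key in fixed:
        fixed[label_key] = fixed[label_key].to(torch.long)
    return fixed


# Toy batch standing in for the model's logits and the dataset labels:
# 3 examples, 4 classes, labels deliberately stored as int32.
batch = {
    "logits": torch.zeros(3, 4),
    "labels": torch.tensor([0, 2, 3], dtype=torch.int32),
}

fixed = ensure_long_labels(batch)

# With int64 labels the cross-entropy loss can be computed on CPU.
loss = F.cross_entropy(fixed["logits"], fixed["labels"])
print(fixed["labels"].dtype, loss.item())
```

In a real run, the same cast would have to happen wherever the labels are turned into tensors (for example in a custom collate function passed to the trainer), which I have not tried here.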
Have fun!