Error from CUDA on audio classification

Hi,

I’m trying to follow the instructions on this page:

and I encounter an error when I try to train:
RuntimeError: "nll_loss_forward_reduce_cuda_kernel_2d_index" not implemented for 'Int'

The error is raised from this call:
return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)

I tried casting the labels to int64 and got the same error. The cast itself takes effect: when I cast to float32 instead, the dtype named in the error message changes to float.
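For context, cross_entropy with class-index targets expects an int64 (torch.long) target tensor, which is why the message complains about 'Int'. One possible explanation, offered as an assumption rather than a confirmed diagnosis: on Windows, NumPy versions before 2.0 default to 32-bit integers, so labels that pass through NumPy during batching can reach the loss as int32 even after an earlier cast. A minimal sketch of checking and casting the dtype right before the loss, with made-up logits and labels:

```python
import torch
import torch.nn.functional as F

# Hypothetical batch for illustration: 4 examples, 3 classes.
logits = torch.randn(4, 3)

# Labels as they might arrive after batching on Windows (assumed int32).
labels_int32 = torch.tensor([0, 2, 1, 2], dtype=torch.int32)

# Class-index targets must be int64 (torch.long), so cast just before the loss.
labels_long = labels_int32.to(torch.long)
loss = F.cross_entropy(logits, labels_long)

print(labels_int32.dtype, "->", labels_long.dtype)
print(loss.item())
```

Printing the label dtype inside the training loop (or in compute_loss) shows whether the cast actually survives until the loss is computed.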

I’m running with:
Windows Server 2022
Python 3.10.10
PyTorch 1.13.1
CUDA 11.7 (the GPU is an A10)
transformers 4.26.1

Thanks

I’ve found a workaround: I installed WSL and ran everything from there, and it works.

Hi @dmoti:

I’ve run into the same issue in a similar environment (Windows 11). In my case, even switching to CPU (setting the TrainingArguments parameter no_cuda to True) produced a similar error. Calling with_format("torch") on the datasets passed to the Trainer made it work both with and without CUDA. Example below:

trainer = Trainer(
    model,
    training_args,
    train_dataset=dataset_encoded["train"].with_format("torch"),
    eval_dataset=dataset_encoded["test"].with_format("torch"),
    tokenizer=feature_extractor,
    compute_metrics=compute_metrics,
)
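If with_format("torch") is not an option, another approach that might work is to cast the labels inside the collate step. The sketch below is an assumption-laden illustration, not something tested in this thread: cast_labels_to_long and toy_collate are hypothetical names, and toy_collate only stands in for whatever collator is really used.

```python
import torch

def cast_labels_to_long(collate_fn):
    """Wrap a collate function so label tensors leave it as int64."""
    def wrapped(features):
        batch = collate_fn(features)
        for key in ("labels", "label"):
            if key in batch and torch.is_tensor(batch[key]):
                batch[key] = batch[key].to(torch.long)
        return batch
    return wrapped

# Stand-in collator (hypothetical) that produces int32 labels on purpose.
def toy_collate(features):
    return {
        "input_values": torch.stack([f["input_values"] for f in features]),
        "labels": torch.tensor([f["label"] for f in features], dtype=torch.int32),
    }

features = [
    {"input_values": torch.zeros(8), "label": 1},
    {"input_values": torch.ones(8), "label": 0},
]
batch = cast_labels_to_long(toy_collate)(features)
print(batch["labels"].dtype)
```

The wrapped function could then be passed to the Trainer through its data_collator argument, in place of the default collator.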

Hope this helps.

Thank you so much