Error from CUDA on audio classification

dmoti · March 7, 2023, 10:37am

Hi,

I’m trying to follow the instructions in this page:

and I encounter an error when I try to train:
RuntimeError: "nll_loss_forward_reduce_cuda_kernel_2d_index" not implemented for 'Int'

the error is in this function:
return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)

I tried casting the labels to int64 and got the same error. The cast itself takes effect: when I cast to float32 instead, the error message changes to mention float.
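For context, the message says the loss kernel has no implementation for 32-bit integer targets: `cross_entropy` expects class-index targets as `torch.int64` (Long). Below is a minimal sketch of that requirement. It uses made-up logits and labels rather than the tutorial's data, and it runs on CPU, where the wording of the error may differ from the CUDA one above.

```python
import torch
import torch.nn.functional as F

# Hypothetical batch: 4 examples, 3 classes (values are invented for illustration).
logits = torch.randn(4, 3)
labels_int32 = torch.tensor([0, 2, 1, 1], dtype=torch.int32)

# Class-index targets are expected to be int64; an int32 target is
# expected to be rejected, so catch the error and keep its message.
try:
    F.cross_entropy(logits, labels_int32)
    int32_error = None
except RuntimeError as err:
    int32_error = str(err)
print("int32 targets:", int32_error)

# Casting the targets to int64 (Long) right before the loss call works.
labels_int64 = labels_int32.long()
loss = F.cross_entropy(logits, labels_int64)
print("int64 targets, loss =", loss.item())
```

If the labels are already int64 in the dataset but the error persists, the dtype may be getting changed later, for example when the batch is collated, so it is worth checking the dtype of the tensor that actually reaches the loss.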

I’m running with:
Windows Server 2022
Python 3.10.10
PyTorch 1.13.1
CUDA 11.7 (the GPU is an A10)
transformers 4.26.1

thanks

dmoti · March 8, 2023, 1:24pm

I’ve found a workaround: I installed WSL and ran everything from there, and it works.

xavialex · August 19, 2023, 7:21pm

Hi @dmoti:

I’ve run into the same issue in a similar environment (Windows 11). In my case, even switching to CPU (setting the TrainingArguments parameter no_cuda to True) resulted in a similar error. Passing the datasets to the Trainer with with_format("torch") made it work both with and without CUDA enabled. Example below:

trainer = Trainer(
    model,
    training_args,
    train_dataset=dataset_encoded["train"].with_format("torch"),
    eval_dataset=dataset_encoded["test"].with_format("torch"),
    tokenizer=feature_extractor,
    compute_metrics=compute_metrics,
)

Hope this helps.

codekingwu · September 18, 2024, 5:11am

Thank you so much
