I'm trying to follow the instructions on this page:
and I encounter an error when I try to train:
RuntimeError: "nll_loss_forward_reduce_cuda_kernel_2d_index" not implemented for 'Int'
the error is in this function:
return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
I tried casting the labels to int64 and got the same error (the cast itself works; when I cast to float32 instead, the error message changes to 'Float' rather than 'Int').
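For context, PyTorch's cross-entropy loss expects class-index targets to have dtype int64 (torch.long); any other integer dtype raises a "not implemented for 'Int'"-style RuntimeError. A minimal sketch (standalone, not the Trainer code from this thread) that reproduces the dtype problem and the cast that fixes it:

```python
import torch
import torch.nn.functional as F

logits = torch.randn(4, 10)  # (batch, num_classes)
# Targets created with the "wrong" dtype, as happens with some datasets:
target = torch.randint(0, 10, (4,), dtype=torch.int32)

# F.cross_entropy requires class-index targets to be int64 (torch.long);
# an int32 target raises a RuntimeError (the exact message differs
# between the CPU and CUDA kernels).
try:
    F.cross_entropy(logits, target)
except RuntimeError as e:
    print(e)

# Casting the target to int64 makes the call succeed.
loss = F.cross_entropy(logits, target.long())
print(loss.dtype)  # torch.float32
```

Note that the cast has to reach the tensor the loss actually sees; casting a copy (or casting before a collator rebuilds the batch) can silently leave the original int32 target in place.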
I'm running with:
Windows Server 2022
CUDA 11.7 (the GPU is an A10)
I've found a workaround: I installed WSL and ran everything from there, and it works.
I've run into the same issue in a similar environment (Windows 11). In my case, even switching to the CPU (setting the TrainingArguments parameter no_cuda to True) resulted in the same kind of error. Calling with_format('torch') on the datasets passed to the Trainer made it work both with and without CUDA enabled. Example below:
trainer = Trainer(
Hope this helps.