Strange error when using the Longformer (HuggingFace developers, please reply)

Hello,

I am trying to use the LongformerForMultipleChoice model, and this is the code I am using:

# imports needed for the snippet below
import torch
from transformers import LongformerTokenizer, LongformerForMultipleChoice

# load the pre-trained HuggingFace Longformer tokenizer
longformer_tokenizer = LongformerTokenizer.from_pretrained('allenai/longformer-base-4096')

# load the pre-trained HuggingFace Longformer multiple-choice model
best_model_longformer = LongformerForMultipleChoice.from_pretrained(
                            'allenai/longformer-base-4096',
                            output_hidden_states=True)

# my multiple-choice question has 4 options, so the question is repeated once per option
question_list = [main_question, main_question, main_question, main_question]
option_list = [option1, option2, option3, option4]
mc_labels = torch.tensor([my_answer])

# tokenize the 4 (question, option) pairs together
encoded_dict = longformer_tokenizer(question_list, option_list,
                                    return_tensors='pt',
                                    add_prefix_space=True,
                                    padding=True)

# add a batch dimension so each tensor has shape (1, num_choices, seq_len);
# with labels passed, the outputs are (loss, logits, hidden_states), so
# [2][0] picks the first hidden state
input_hidden_state = best_model_longformer(
                       **{k: v.unsqueeze(0) for k, v in encoded_dict.items()},
                       labels=mc_labels)[2][0].detach()

and I am getting the error below:

/home/ec2-user/anaconda3/lib/python3.7/site-packages/transformers/modeling_longformer.py:71: UserWarning: This overload of nonzero is deprecated:
        nonzero()
Consider using one of the following signatures instead:
        nonzero(*, bool as_tuple) (Triggered internally at  /pytorch/torch/csrc/utils/python_arg_parser.cpp:766.)
  sep_token_indices = (input_ids == sep_token_id).nonzero()
Traceback (most recent call last):
  File "SEED_125_V20_15_LONGFORMER.py", line 427, in <module>
    main_function('/home/ec2-user/G1G2.txt','/home/ec2-user/G1G2_answer_num.txt', num_iter)
  File "SEED_125_V20_15_LONGFORMER.py", line 389, in main_function
    best_model_longformer)
  File "SEED_125_V20_15_LONGFORMER.py", line 198, in fill_MC_loss_accuracy_tensor
    input_hidden_state = best_model_longformer(**{k: v.unsqueeze(0) for k,v in encoded_dict.items()}, labels = mc_labels)[2][0][:,:,:].detach()
  File "/home/ec2-user/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ec2-user/anaconda3/lib/python3.7/site-packages/transformers/modeling_longformer.py", line 1808, in forward
    loss = loss_fct(reshaped_logits, labels)
  File "/home/ec2-user/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ec2-user/anaconda3/lib/python3.7/site-packages/torch/nn/modules/loss.py", line 948, in forward
    ignore_index=self.ignore_index, reduction=self.reduction)
  File "/home/ec2-user/anaconda3/lib/python3.7/site-packages/torch/nn/functional.py", line 2422, in cross_entropy
    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
  File "/home/ec2-user/anaconda3/lib/python3.7/site-packages/torch/nn/functional.py", line 2218, in nll_loss
    ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
IndexError: Target 1 is out of bounds.

How can I fix this error?

Thank you.

It probably means your number of classes does not match the labels you are passing to the model.

Hello,

I am not sure what you mean by this. Could you please elaborate?

Thank you,

Are you fine-tuning it?
I am also new to transformers, but for the last two days I have been struggling with this same error.
The issue was that my labels were [0, 2], whereas the model expected them in [0, 1]. Once I remapped the label classes to [0, 1], it worked.
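To illustrate what was going on (a minimal sketch with made-up logits, not the actual model): CrossEntropyLoss raises exactly this IndexError whenever a label value is >= the number of classes, and remapping the labels to a contiguous 0..num_classes-1 range makes it go away.

```python
import torch
import torch.nn as nn

# pretend logits for a batch of 1 question with only 2 choices
logits = torch.randn(1, 2)
loss_fct = nn.CrossEntropyLoss()

# a label of 2 is out of range for 2 classes (valid targets are 0 and 1),
# which triggers the same "Target ... is out of bounds" IndexError
try:
    loss_fct(logits, torch.tensor([2]))
except IndexError as e:
    print(e)

# remapping the label values {0, 2} onto contiguous {0, 1} fixes it
label_map = {0: 0, 2: 1}
loss = loss_fct(logits, torch.tensor([label_map[2]]))
print(loss.item())
```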

My answers are encoded in the right way; I don't think the Longformer is correctly picking up that my multiple-choice questions have 4 options…

In my case I am struggling with CUDA out-of-memory errors with the Longformer; I am using Google Colab Pro.

Did you try this?

I don't think this has anything to do with my problem… but thanks for sharing.

The issue should be with mc_labels: labels must lie in the range [0..num_choices-1].
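A quick way to verify both conditions before calling the model (a sketch using a hypothetical helper, check_mc_inputs, which is not part of transformers): the model infers num_choices from the second dimension of input_ids, so input_ids must be 3-D of shape (batch_size, num_choices, seq_len), and every label must fall in [0, num_choices - 1].

```python
import torch

def check_mc_inputs(input_ids, labels):
    # hypothetical sanity check for multiple-choice inputs:
    # the model reads num_choices from input_ids.shape[1]
    assert input_ids.dim() == 3, (
        "input_ids should be (batch_size, num_choices, seq_len); "
        f"got shape {tuple(input_ids.shape)}")
    num_choices = input_ids.shape[1]
    assert 0 <= labels.min().item() and labels.max().item() < num_choices, (
        f"labels must lie in [0, {num_choices - 1}]; got {labels.tolist()}")
    return num_choices

# e.g. a batch of 1 question with 4 tokenized (question, option) pairs
fake_input_ids = torch.zeros(1, 4, 10, dtype=torch.long)
print(check_mc_inputs(fake_input_ids, torch.tensor([3])))  # → 4
```

If the shape check fails here, the tokenizer output was never reshaped to (batch, num_choices, seq_len), which would also make the model see fewer choices than intended.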