Error when using the forward() function of `LongformerLayer` class

Hello,

Sorry if my question sounds a bit silly, but I have a question:
I am trying to use the LongformerForMultipleChoice model for a multiple-choice question that has 4 options.

When I do:

my_Longformer_multiple_choice_model.encoder.layer[layer_index].forward(
    hidden_output, attention_mask=my_attention_mask, output_attentions=False)

this error is raised:


  File "/Users/hyunjindominiquecho/opt/anaconda3/lib/python3.7/site-packages/transformers/modeling_longformer.py", line 384, in _sliding_chunks_query_key_matmul
    batch_size, seq_len, num_heads, head_dim = query.size()

ValueError: too many values to unpack (expected 4)

Here, my_attention_mask is the same attention mask that I would pass in the regular LongformerForMultipleChoice call:

# I am using the LongformerForMultipleChoice model, where each multiple choice question has 4 options.
my_attention_mask = tensor([[[1, 1, 1,  ..., 0, 0, 0],
         [1, 1, 1,  ..., 0, 0, 0],
         [1, 1, 1,  ..., 0, 0, 0],
         [1, 1, 1,  ..., 0, 0, 0]]])
# I can use the my_attention_mask in the regular command as below:
longformer_output = my_Longformer_multiple_choice_model(input_ids=input_ids, ...., attention_mask=my_attention_mask)
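For what it's worth, here is a minimal sketch of how a multiple-choice model typically collapses a (batch_size, num_choices, seq_len) attention mask into (batch_size * num_choices, seq_len) before the encoder layers see it. This uses plain nested lists and a hypothetical flatten_choices helper, not the actual transformers internals; it is only meant to show why a 3-D mask passed straight to a single layer could produce a shape mismatch like the one above.

```python
# Sketch: flattening a (batch_size, num_choices, seq_len) mask to
# (batch_size * num_choices, seq_len). Nested lists stand in for
# tensors; flatten_choices is a hypothetical helper, not a
# transformers function.

def flatten_choices(mask):
    """Collapse the batch and num_choices dimensions into one."""
    return [row for example in mask for row in example]

# Toy values: batch_size=1, num_choices=4, seq_len=6
my_attention_mask = [[[1, 1, 1, 0, 0, 0]] * 4]

flat_mask = flatten_choices(my_attention_mask)
print(len(flat_mask), len(flat_mask[0]))  # prints: 4 6
```

After flattening, each of the 4 options looks like an independent sequence to the encoder, which is the shape a single layer expects.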

Why is this ValueError raised? What should I pass for the attention_mask parameter in the call my_Longformer_multiple_choice_model.encoder.layer[layer_index].forward(hidden_output, attention_mask, output_attentions=False)?

Thank you,

That seems to me like an error related more to the data structures than to the Longformer itself; maybe you should check the format of your input data first.

Yes, and this is why I am asking the Hugging Face staff what I should pass as the attention_mask parameter in my command.

Yes, in my case I passed the text to the tokenizer and then took the fields from the inputs:

inputs = tokenizer.encode_plus(example.text_a, example.text_b, add_special_tokens=True, max_length=max_length, pad_to_max_length=True)

input_ids, attention_mask = inputs["input_ids"], inputs["attention_mask"]

Not sure if that is the issue with your work.
Can you please let me know what your batch size is, and which environment is better for such a model? I am using a tiny dataset; otherwise it crashes Google Colab Pro.

Please note that I am calling
my_Longformer_multiple_choice_model.encoder.layer[layer_index].forward(hidden_output, attention_mask, output_attentions=False).

Please read my question carefully. my_attention_mask can be used in the command
my_Longformer_multiple_choice_model(input_ids=input_ids,....,attention_mask=my_attention_mask) without an error.

You are not doing
my_Longformer_multiple_choice_model.encoder.layer[layer_index].forward().

This has to do with the Hugging Face code for .encoder.layer.

Please disregard this post… I can’t quite delete it.

Python functions can return multiple values, and these values can be unpacked into separate variables directly. Other programming languages such as C++ or Java do not support this by default.

The ValueError: too many values to unpack occurs during a multiple assignment where you either don't have enough objects to assign to the variables or you have more objects to assign than variables. If, for example, myfunction() returned an iterable with three items instead of the expected two, then you would have more objects than variables to assign them to.
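As a toy illustration of that paragraph (plain Python, nothing Longformer-specific), the same ValueError appears whenever the right-hand side yields more items than there are targets on the left, which is analogous to query.size() producing 5 dimensions while the code unpacks only 4:

```python
# Unpacking succeeds when the number of targets matches the
# number of items returned.
def point3():
    return (1, 2, 3)

x, y, z = point3()  # fine

# Unpacking fails when there are more items than targets.
try:
    a, b = point3()
except ValueError as e:
    print(e)  # prints: too many values to unpack (expected 2)
```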