batch[k] = torch.tensor([f[k] for f in features]) ValueError: expected sequence of length 3 at dim 1 (got 4)

Hi there,

I am trying to build a multiple-choice question solver and I am getting the following error.
Any thoughts on what could be causing it?

  File "../src/run_multiple_choice.py", line 195, in main
    model_path=model_args.model_name_or_path if os.path.isdir(model_args.model_name_or_path) else None
  File "/usr/local/lib/python3.7/site-packages/transformers/trainer.py", line 755, in train
    for step, inputs in enumerate(epoch_iterator):
  File "/usr/local/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 363, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 403, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/usr/local/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/usr/local/lib/python3.7/site-packages/transformers/data/data_collator.py", line 65, in default_data_collator
    batch[k] = torch.tensor([f[k] for f in features])
ValueError: expected sequence of length 3 at dim 1 (got 4)

It looks like my instances were not all the same size. Making them the same size fixed the problem.
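For anyone hitting this later: the collator calls torch.tensor on a list of per-example feature lists, and torch.tensor refuses ragged nested lists. A minimal sketch of the failure (the feature values here are made up for illustration):

```python
import torch

# Two "examples" whose input_ids have different lengths, as happens when
# the tokenizer did not pad them to a common length.
features = [
    {"input_ids": [101, 2054, 102]},        # length 3
    {"input_ids": [101, 2003, 2023, 102]},  # length 4
]

try:
    # This is essentially what default_data_collator does per key.
    batch = torch.tensor([f["input_ids"] for f in features])
except ValueError as e:
    # Raises: expected sequence of length 3 at dim 1 (got 4)
    print(e)
```

Once every example has the same length, the same torch.tensor call stacks them into a (num_examples, seq_len) tensor without error.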

Hi, I’m running into the same problem. Preprocessing for my custom multiple-choice question answering task throws the same ValueError. Do you mind explaining how you resolved it?

I just fixed this error by setting my tokenizer's padding to 'max_length'.
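To show what that padding actually does, here is a minimal pure-Python sketch of the behavior of padding="max_length" (the token IDs and pad ID are made up for illustration): every sequence is truncated and then padded to exactly max_length, so the collator can stack them.

```python
def pad_to_length(ids, max_length, pad_id=0):
    # Mimics tokenizer(..., padding="max_length", truncation=True):
    # truncate to max_length, then right-pad with pad_id.
    ids = ids[:max_length]
    return ids + [pad_id] * (max_length - len(ids))

# Ragged examples of length 3 and 4 (illustrative values).
features = [[101, 2054, 102], [101, 2003, 2023, 102]]
padded = [pad_to_length(f, max_length=6) for f in features]
# All rows now have length 6, so torch.tensor(padded) would succeed
# where it previously raised the "expected sequence of length" error.
print(padded)
```

The same effect can be had with padding=True (pad to the longest example in the batch) when a data collator pads per batch, but padding="max_length" is the simplest fix when using default_data_collator.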

1 Like

Hi, how can I run the multiple-choice QA model if I have only two labels (1 and 2)? When I pass num_labels=2 to AutoModelForMultipleChoice, the logits have shape (n x 4), but I need (n x 2) since I have only two labels. How can I solve this problem?

tutorial I referred to: Multiple choice
dataset used: art · Datasets at Hugging Face
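For context on why the logits came out (n x 4): in the standard transformers multiple-choice head, the second logits dimension comes from the number of choices in the input batch (dimension 1 of input_ids), not from num_labels. The head scores each choice with a single-output classifier and reshapes. A minimal sketch with plain torch (sizes are made up for illustration):

```python
import torch

# Shapes mimicking a multiple-choice batch: 2 questions, 2 choices each.
batch_size, num_choices, hidden_size = 2, 2, 8

# Pretend pooled encoder output for each (question, choice) pair,
# flattened to (batch_size * num_choices, hidden_size).
pooled = torch.randn(batch_size * num_choices, hidden_size)

# The multiple-choice head is a single-score classifier per choice...
classifier = torch.nn.Linear(hidden_size, 1)

# ...and the scores are reshaped back to (batch_size, num_choices).
logits = classifier(pooled).view(-1, num_choices)
print(logits.shape)  # torch.Size([2, 2])
```

So feeding two choices per question yields (n x 2) logits regardless of num_labels; (n x 4) suggests the preprocessed examples contained four choices each.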

This is solved. Thanks

I solved it by using padding='max_length': tokens = tokenizer(text, padding='max_length')