Evaluating multiple choices using BertForMultipleChoice

Hello community.
I want to use BertForMultipleChoice to, well, answer a multiple choice question.
However, the example script provided on

only allows for two choices. I want to find the best answer among 4 choices. Here are my script and its output (running on CPU only):

Any help would be greatly appreciated. I followed the error messages and considered doing some library surgery, but could not figure out where to flip the switch from “only allows 2 input sentences” to “allows more than 2 input sentences”.

Input:
from transformers import BertTokenizer, BertForMultipleChoice
import tensorflow as tf
import torch

tokenizer = BertTokenizer.from_pretrained('cl-tohoku/bert-base-japanese')
model = BertForMultipleChoice.from_pretrained('cl-tohoku/bert-base-japanese')
prompt = "人気作家A氏の講演会が無料[MASK]、多くのファンが詰めかけた。"
choice0 = "にして"
choice1 = "にあって"
choice2 = "として"
choice3 = "とあって"
labels = torch.tensor(0).unsqueeze(0)
encoding = tokenizer([[prompt, prompt, prompt, prompt], [choice0, choice1, choice2, choice3]], return_tensors='pt', padding=True)
outputs = model(**{k: v.unsqueeze(0) for k,v in encoding.items()}, labels=labels)
loss = outputs.loss
logits = outputs.logits
print(logits)

Error Message:
Traceback (most recent call last):
  File "pretrain.py", line 13, in <module>
    encoding = tokenizer([[prompt, prompt, prompt, prompt], [choice0, choice1, choice2, choice3]], return_tensors='pt', padding=True)
  File "/home/eo/anaconda3/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 2289, in __call__
    return self.batch_encode_plus(
  File "/home/eo/anaconda3/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 2474, in batch_encode_plus
    return self._batch_encode_plus(
  File "/home/eo/anaconda3/lib/python3.8/site-packages/transformers/tokenization_utils.py", line 543, in _batch_encode_plus
    ids, pair_ids = ids_or_pair_ids
ValueError: too many values to unpack (expected 2)
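
The last frame of the traceback suggests that each item of the batch is unpacked into exactly two values, a text and its pair. Under that reading, a list of two 4-element lists is treated as two malformed pairs rather than as four prompts plus four choices. Below is a minimal pure-Python sketch of that behaviour; `unpack_pairs` is a hypothetical stand-in for the unpacking step, not the library's actual code.

```python
def unpack_pairs(batch):
    """Hypothetical stand-in for the unpacking step shown in the traceback.

    Each batch item must hold exactly two elements: (text, text_pair).
    """
    pairs = []
    for ids_or_pair_ids in batch:
        ids, pair_ids = ids_or_pair_ids  # ValueError if the item is not length 2
        pairs.append((ids, pair_ids))
    return pairs


prompt = "人気作家A氏の講演会が無料[MASK]、多くのファンが詰めかけた。"
choices = ["にして", "にあって", "として", "とあって"]

# Layout used in the script: 2 batch items with 4 elements each.
bad_batch = [[prompt] * 4, choices]
try:
    unpack_pairs(bad_batch)
    bad_error = None
except ValueError as err:
    bad_error = str(err)

# Alternative layout: one (prompt, choice) pair per candidate answer,
# i.e. 4 batch items with 2 elements each.
good_batch = list(zip([prompt] * 4, choices))
good_pairs = unpack_pairs(good_batch)

print(bad_error)
print(len(good_pairs))
```

If this reading is right, the number of choices is not limited to two; only the argument layout matters. With the real tokenizer, the prompts and the choices would presumably go in as two separate arguments, e.g. `tokenizer([prompt] * 4, choices, return_tensors='pt', padding=True)`, which mirrors the two-choice documentation example. The `unsqueeze(0)` already in the script should then produce tensors of shape (1, 4, sequence length), i.e. one question with four choices.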
