Hi, I’m working through the brief example in the VisualBertForMultipleChoice section of the VisualBERT documentation page.
This is the code snippet I started with:
from transformers import AutoTokenizer, VisualBertForMultipleChoice
import torch
tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")
model = VisualBertForMultipleChoice.from_pretrained("uclanlp/visualbert-vcr")
prompt = "In Italy, pizza served in formal settings, such as at a restaurant, is presented unsliced."
choice0 = "It is eaten with a fork and a knife."
choice1 = "It is eaten while held in the hand."
encoding = tokenizer([[prompt, prompt], [choice0, choice1]], return_tensors="pt", padding=True)
This runs without issue. However, when I add a third option to the choices:
from transformers import AutoTokenizer, VisualBertForMultipleChoice
import torch
tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")
model = VisualBertForMultipleChoice.from_pretrained("uclanlp/visualbert-vcr")
prompt = "In Italy, pizza served in formal settings, such as at a restaurant, is presented unsliced."
choice0 = "It is eaten with a fork and a knife."
choice1 = "It is eaten while held in the hand."
choice2 = "It is eaten while torn into smaller pieces."
encoding = tokenizer([[prompt, prompt, prompt], [choice0, choice1, choice2]], return_tensors="pt", padding=True)
I get the following error:
TypeError: TextEncodeInput must be Union[TextInputSequence, Tuple[InputSequence, InputSequence]]
I am really stumped as to why this is happening. I’ve experimented with a few different ways of passing in the input sequences, but I keep getting this error. I’m confused because my input is still a list of lists, which I understood to be an accepted input format.
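To make the question concrete, here is a minimal sketch that inspects the nesting of the two inputs without calling the tokenizer at all. It rests on an assumption that is not confirmed anywhere in this post: that the tokenizer reads each inner list of a list of lists as one input, which (going by the error message) must be either a single sequence or a pair of exactly two sequences. The names `inner_lengths`, `failing_layout`, and `paired_layout` are hypothetical and used only for illustration.

```python
# Pure-Python sketch: no tokenizer is called, only the nesting is inspected.
prompt = "In Italy, pizza served in formal settings, such as at a restaurant, is presented unsliced."
choices = [
    "It is eaten with a fork and a knife.",
    "It is eaten while held in the hand.",
    "It is eaten while torn into smaller pieces.",
]

# Layout used in the failing call: two inner lists, each holding three strings.
failing_layout = [[prompt] * len(choices), choices]

# Alternative layout (assumption): one [prompt, choice] pair per choice.
paired_layout = [[prompt, choice] for choice in choices]


def inner_lengths(batch):
    """Return how many strings each top-level item contains (hypothetical helper)."""
    return [len(item) for item in batch]


print(inner_lengths(failing_layout))  # [3, 3]
print(inner_lengths(paired_layout))   # [2, 2, 2]
```

If the assumption holds, the two-choice version would run only because each inner list happens to contain exactly two strings, and the three-choice version would fail because a three-element inner list is neither a single sequence nor a pair. Passing something shaped like `paired_layout`, or passing the prompts and the choices as two separate list arguments, might then avoid the TypeError, though neither call has been verified here.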
I’ve spent quite a while on what feels like a simple issue, and I would really appreciate some help with it. Thank you in advance!