Encoding a sentence pair with BERT causes ValueError: not enough values to unpack (expected 2, got 1)

Solved!

Apparently `input_ids`, `attention_mask`, and `token_type_ids` all need to have shape `(batch_size, sequence_length)`, so when I used

`.unsqueeze(0)`

instead of

`.squeeze(0)`

to add the missing batch dimension, it worked.
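
For reference, here's a minimal sketch of the fix. The checkpoint (`bert-base-uncased`) and sentences are placeholders, not from my original code:

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

# Encode a sentence pair; without return_tensors, the values are plain lists
enc = tokenizer("first example sentence", "second example sentence")

# Converting by hand gives 1-D tensors of shape (sequence_length,); the model
# then fails on `batch_size, seq_length = input_shape` with the ValueError
# above. .unsqueeze(0) adds the batch dimension -> (1, sequence_length).
input_ids = torch.tensor(enc["input_ids"]).unsqueeze(0)
attention_mask = torch.tensor(enc["attention_mask"]).unsqueeze(0)
token_type_ids = torch.tensor(enc["token_type_ids"]).unsqueeze(0)

outputs = model(input_ids=input_ids,
                attention_mask=attention_mask,
                token_type_ids=token_type_ids)
print(outputs.last_hidden_state.shape)  # torch.Size([1, seq_len, 768])
```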
In addition, the tokenizer call should include the parameter `is_split_into_words=True`, so that pre-tokenized word lists aren't mistaken for a batch of separate sequences.
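
And a sketch of that pre-tokenized case (the words here are made up):

```python
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Each sentence is already a list of words, so is_split_into_words=True tells
# the tokenizer this is one sentence pair of pre-split words, not a batch.
words_a = ["the", "cat", "sat"]
words_b = ["on", "the", "mat"]

encoding = tokenizer(words_a, words_b,
                     is_split_into_words=True,
                     return_tensors="pt")
print(encoding["input_ids"].shape)  # torch.Size([1, sequence_length])
```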

Hope this helps others stuck on the same thing.
