Hi everyone,
I get an error when using BertTokenizer. When I run

encoding = tokenizer([[prompt, prompt, prompt], [choice0, choice1, choice2]], return_tensors='tf', padding=True)

I get

ValueError: too many values to unpack (expected 2)
When I do

encoding = tokenizer([[prompt, prompt], [choice0, choice1]], return_tensors='tf', padding=True)

it works. Any idea why? I want to fine-tune TFBertForMultipleChoice so that each question (prompt) has three choices, not two as in the documentation: BERT — transformers 4.7.0 documentation.
Below is the complete code:
import os
import numpy as np
import pandas as pd
import tensorflow as tf
from transformers import BertTokenizer, TFBertForMultipleChoice
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = TFBertForMultipleChoice.from_pretrained('bert-base-uncased')
prompt = "Accept and check containers of mail from large volume mailers, couriers, and contractors."
choice0 = "Time Management"
choice1 = "Writing"
choice2 = "Reading Comprehension"
encoding = tokenizer([[prompt, prompt, prompt], [choice0, choice1, choice2]], return_tensors='tf', padding=True)
inputs = {k: tf.expand_dims(v, 0) for k, v in encoding.items()}
outputs = model(inputs) # batch size is 1
logits = outputs.logits
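For context, my guess (an assumption, not verified against the tokenizer source) is that the tokenizer treats a nested list as a batch of (text, text_pair) pairs and unpacks each inner list into exactly two values, which would explain why two-element inner lists work and three-element ones raise this exact ValueError. A minimal sketch of that behavior, with no transformers dependency:

```python
# Hypothetical sketch: mimic a tokenizer that unpacks each inner list
# of a batch as a (text, text_pair) pair. The function name is made up
# for illustration; it is not part of the transformers API.
def unpack_pairs(batch):
    pairs = []
    for item in batch:
        text, text_pair = item  # raises ValueError if len(item) != 2
        pairs.append((text, text_pair))
    return pairs

# Two-element inner lists unpack fine, as in the documentation example:
unpack_pairs([["prompt", "prompt"], ["choice0", "choice1"]])

# Three-element inner lists reproduce the error message I see:
try:
    unpack_pairs([["prompt", "prompt", "prompt"],
                  ["choice0", "choice1", "choice2"]])
except ValueError as e:
    print(e)  # too many values to unpack (expected 2)
```

If that is the cause, it would mean the nested-list form is inherently limited to two elements per inner list, regardless of the number of choices.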
Thanks!
Ayala