Hello! I have a beginner question. I am trying to build a model that makes predictions on the QAngaroo dataset with DistilBert. In this dataset, each example gives a list of supports and some candidate answers (anywhere between 2 and 100), and the model needs to choose the right answer. Right now I am trying to use TFDistilBertForMultipleChoice, but I am running into a problem since num_choices is fixed across the entire batch. I was wondering how I could go about making that value dynamic.
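For context, here is a minimal sketch (plain Python, no TensorFlow) of the workaround I have been considering: padding every example's candidate list up to a fixed maximum and carrying a mask that marks which slots are real. `MAX_CHOICES` and `PAD_CHOICE` are my own placeholders, not anything from the transformers API.

```python
MAX_CHOICES = 8  # placeholder cap; QAngaroo can have up to ~100 candidates
PAD_CHOICE = ""  # dummy text for unused choice slots

def pad_candidates(candidates, max_choices=MAX_CHOICES, pad=PAD_CHOICE):
    """Pad a variable-length candidate list to a fixed size and
    return a mask marking which slots hold real candidates."""
    if len(candidates) > max_choices:
        raise ValueError("more candidates than the fixed num_choices")
    n_pad = max_choices - len(candidates)
    padded = candidates + [pad] * n_pad
    mask = [1] * len(candidates) + [0] * n_pad
    return padded, mask

padded, mask = pad_candidates(["Paris", "London", "Berlin"])
# padded -> ["Paris", "London", "Berlin", "", "", "", "", ""]
# mask   -> [1, 1, 1, 0, 0, 0, 0, 0]
```

The idea would be to use the mask to push the logits of the padded slots to a large negative value before the softmax, so a dummy choice can never win. I am not sure this is the intended way, though.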
I also had some doubts about the way input goes into the model, from the code example at https://huggingface.co/transformers/master/model_doc/distilbert.html#transformers.TFDistilBertForMultipleChoice.call
import tensorflow as tf
from transformers import AutoTokenizer, TFDistilBertForMultipleChoice

tokenizer = AutoTokenizer.from_pretrained('distilbert-base-uncased')
model = TFDistilBertForMultipleChoice.from_pretrained('distilbert-base-uncased')
prompt, choice0, choice1 = "the question text", "candidate 0", "candidate 1"
encoding = tokenizer([prompt, prompt], [choice0, choice1], return_tensors='tf', padding=True)
inputs = {k: tf.expand_dims(v, 0) for k, v in encoding.items()}  # add the batch dim -> (1, num_choices, seq_len)
outputs = model(inputs)  # batch size is 1, num_choices is 2
Here the prompt is repeated once per choice. Won't this result in a really slow model when there are many choices that all share the same prompt?
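To make the concern concrete, this is roughly how the inputs get built: each candidate is paired with its own full copy of the prompt, so the (potentially very long) support text is duplicated once per choice. This is just plain Python to illustrate the duplication, not the transformers API itself.

```python
def build_choice_pairs(prompt, candidates):
    # One (prompt, candidate) pair per choice; the prompt text is copied
    # each time, so the raw input (and the encoded tensors) grow
    # linearly with the number of candidates.
    return [[prompt, c] for c in candidates]

prompt = "some long concatenation of support documents " * 50
candidates = ["choice %d" % i for i in range(100)]

pairs = build_choice_pairs(prompt, candidates)
total_prompt_chars = sum(len(p) for p, _ in pairs)
assert total_prompt_chars == len(prompt) * len(candidates)
```

With 100 candidates the supports are encoded 100 times over, which is what makes me think this approach won't scale.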
Can anyone help me understand how to fix this, or tell me if I'm going about it the wrong way? I'm starting to think multiple choice isn't the right approach, and that it would be better to ignore the given choices and use a question answering model instead, since all of the choices are contained somewhere in the input. But this seems a bit 'wrong' to me: many choices appear several times in the text, and a question answering model predicts only one start position and one end position, while the correct entity might occur several times in the text. How can training work if the model needs to predict the right location for a string that appears several times? Especially when there isn't really a single correct location in the first place, since the QAngaroo dataset tests multi-hop reasoning, which I wouldn't expect to be tied to any single occurrence of the answer string.
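To make the span-ambiguity issue concrete, a quick scan for every occurrence of a candidate string in the support text shows there is often no single "correct" start position to train against (plain Python, just for illustration):

```python
def find_all_occurrences(text, answer):
    """Return the start index of every occurrence of `answer` in `text`."""
    starts, i = [], text.find(answer)
    while i != -1:
        starts.append(i)
        i = text.find(answer, i + 1)
    return starts

support = "Paris is in France. The capital of France is Paris. Paris hosted the games."
print(find_all_occurrences(support, "Paris"))  # -> [0, 45, 52]
```

If a candidate yields several equally valid spans like this, I don't see which one the start/end labels should point at during training.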
Any help would be greatly appreciated!