Fine-tuning a QA model on a SQUAD 2 dataset with more than one answer


I have been able to successfully fine-tune some QA models on a custom SQUAD dataset using scripts like and from

But these scripts are made only for SQUAD datasets in which every question has exactly one possible answer, because the code only ever reads the first answer:

start_char = answers["answer_start"][0]

targets = [answer["text"][0] if len(answer["text"]) > 0 else "" for answer in answers]

In my SQUAD dataset, some questions can have several answers. How should I deal with this situation?

I have tried concatenating the answers with a special separator in order to turn several answers into one, for example: “answer1 ##AND## answer2”. But this does not work properly: the model never returns that kind of concatenated answer.
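One likely reason, at least for the extractive script (this assumes the model predicts a start and end position inside the context): the concatenated string never occurs verbatim in the context, so there is no span to label. A minimal sketch with made-up data:

```python
# Illustrative data only; not taken from the real dataset.
context = "The library supports Python and Rust. Bindings for Go are planned."
answers = ["Python", "Rust"]

# Each individual answer is a contiguous span of the context,
# so its character offset can be found.
spans = [context.find(a) for a in answers]

# The concatenated target is not a substring of the context,
# so no start/end position exists for it.
joined = " ##AND## ".join(answers)
joined_start = context.find(joined)

print(spans)         # character offsets of each answer
print(joined_start)  # -1 means "not found"
```

A seq2seq model is not bound to spans in the same way, so the concatenation idea may behave differently there.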

I’ve also tried converting the SQUAD dataset to single-answer form, but in that case the same question and context appear several times in the dataset, each time with a different, disjoint answer, and I find this approach weird.
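The single-answer conversion described above can be sketched roughly as follows. This is only an illustration: `explode_answers` is a hypothetical helper, not part of those scripts, and it assumes each record uses the usual SQUAD layout with an "answers" dict holding parallel "text" and "answer_start" lists.

```python
def explode_answers(example):
    """Hypothetical helper: split one SQUAD-style record into one record per answer."""
    texts = example["answers"]["text"]
    starts = example["answers"]["answer_start"]
    if not texts:
        # SQUAD 2 unanswerable question: keep it unchanged.
        return [example]
    rows = []
    for i, (text, start) in enumerate(zip(texts, starts)):
        row = dict(example)  # shallow copy; "id" and "answers" are replaced below
        row["id"] = f'{example["id"]}_{i}'  # keep ids unique after the split
        row["answers"] = {"text": [text], "answer_start": [start]}
        rows.append(row)
    return rows


# Made-up record with two disjoint answers for the same question.
record = {
    "id": "q1",
    "question": "Which languages are supported?",
    "context": "The library supports Python and Rust.",
    "answers": {"text": ["Python", "Rust"], "answer_start": [21, 32]},
}
rows = explode_answers(record)
for row in rows:
    print(row["id"], row["answers"])
```

Each resulting row has exactly one answer, so a line like answers["answer_start"][0] reads all of them across the rows, but the model then sees the same question and context with different targets.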

So how do you go about training with a custom SQUAD 2 dataset with multiple answers for the same question?

Thanks in advance!

Did you ever figure out a solution for multiple answers?