Fine-tuning a QA model on a SQuAD 2 dataset with more than one answer

jrredondo · August 22, 2023, 10:53am

Hello.

I have been able to successfully fine-tune some models for QA on a custom SQuAD dataset using scripts like run_qa.py and run_seq2seq_qa.py from https://github.com/huggingface/transformers/blob/main/examples/pytorch/question-answering/

But these scripts only handle SQuAD datasets in which every question has a single possible answer, because the code contains this:

For run_qa.py:
start_char = answers["answer_start"][0]

For run_seq2seq_qa.py:
targets = [answer["text"][0] if len(answer["text"]) > 0 else "" for answer in answers]
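
To make the effect of those two lines concrete, here is a minimal sketch that applies them to a made-up record with two gold answers. The context, the answer texts, and the name `answers_for_example` are invented for illustration; only the two quoted expressions come from the scripts.

```python
# Made-up SQuAD-style record with two gold answers for one question.
context = "I live in Paris, which is the French capital."
answer_texts = ["Paris", "the French capital"]
answers_for_example = {
    "text": answer_texts,
    # Character offsets are computed from the context rather than typed by hand.
    "answer_start": [context.index(t) for t in answer_texts],
}

# Same indexing as the run_qa.py line above: only the first span is read.
start_char = answers_for_example["answer_start"][0]

# Same expression as the run_seq2seq_qa.py line above, applied to a batch of one.
answers = [answers_for_example]
targets = [answer["text"][0] if len(answer["text"]) > 0 else "" for answer in answers]

print(start_char)  # offset of the first answer only
print(targets)     # the second answer never becomes a training target
```

In both cases the second answer is present in the record but is never used for training.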

In my SQuAD dataset, some questions can have several answers. How should I deal with this situation?

I have tried concatenating the answers with a special separator to turn several answers into one, for example: "answer1 ##AND## answer2". But this does not work properly: the model never returns that kind of concatenated answer.
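
For reference, the concatenation idea can be sketched as below. `join_answers` and `split_prediction` are hypothetical helpers, not part of the transformers scripts, and the sketch only makes sense for seq2seq-style targets, where the target is free text. It only shows the data transformation and says nothing about whether a model will learn to emit the separator.

```python
SEPARATOR = " ##AND## "  # separator string from the post; any unambiguous marker would do


def join_answers(texts):
    """Build one seq2seq target string from a list of gold answers."""
    return SEPARATOR.join(texts)


def split_prediction(prediction):
    """Recover the individual answers from a generated string."""
    marker = SEPARATOR.strip()
    return [part.strip() for part in prediction.split(marker) if part.strip()]


target = join_answers(["answer1", "answer2"])
recovered = split_prediction(target)

print(target)
print(recovered)
```

An empty answer list gives an empty target string, which matches how the quoted run_seq2seq_qa.py line treats questions without an answer.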

I've also tried converting the SQuAD dataset to single-answer form, but then the dataset contains several different, disjoint answers for the same question in the same context, and I find this approach weird.
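
The single-answer conversion can be sketched as follows. `expand_multi_answer` is a hypothetical helper and the record is made up. The id suffix is an assumption added here so that the copies stay distinguishable; it is not something the scripts do.

```python
def expand_multi_answer(example):
    """Return one single-answer copy of `example` per gold answer.

    An example with no answers (unanswerable, SQuAD 2 style) is kept as is.
    """
    texts = example["answers"]["text"]
    starts = example["answers"]["answer_start"]
    if not texts:
        return [example]
    return [
        {
            **example,
            # Assumption: suffix the id so the copies do not share one id.
            "id": f"{example['id']}_{i}",
            "answers": {"text": [text], "answer_start": [start]},
        }
        for i, (text, start) in enumerate(zip(texts, starts))
    ]


context = "I live in Paris, which is the French capital."
answer_texts = ["Paris", "the French capital"]
example = {
    "id": "q1",
    "question": "Where do I live?",
    "context": context,
    "answers": {
        "text": answer_texts,
        "answer_start": [context.index(t) for t in answer_texts],
    },
}

rows = expand_multi_answer(example)
for row in rows:
    print(row["id"], row["answers"])
```

Each resulting row has the same question and context but a different target, which is exactly the situation described above.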

So how do you go about training on a custom SQuAD 2 dataset with multiple answers for the same question?

Thanks in advance!

sidvash · March 15, 2024, 3:53pm

Did you ever figure out a solution for multiple answers?

madhusikha · November 6, 2024, 6:47pm

Did you solve the issue?
