Fine-tune mt5 on Question Answering with run_qa

xrazor9 · February 22, 2021, 2:30pm

Hello everyone, I wanted to fine-tune an mT5 model on QA with run_qa.py, but it doesn’t work.
I’m a beginner, so I have no idea how to solve the problem. Does anyone know how to make it work?

rgwatwormhill · February 24, 2021, 9:29pm

Are you sure MT5 can do Question Answering? Would BERT be better?

I can’t help, but I suggest you give a few more details: what happens when it “doesn’t work”?

CarlosPR · April 9, 2021, 7:09pm

Hello @xrazor9, were you eventually able to do it? I am also a beginner and I want to fine-tune mT5 for Q&A, and then also for abstractive summarization (both in Spanish), but the tutorials I’ve found aren’t working for me.

Thanks in advance!

nielsr · August 25, 2021, 7:21am

Hi,

mT5 is, like T5, an encoder-decoder model. The run_qa.py script only supports encoder-only models (like BERT, RoBERTa, DistilBERT, etc.). This script only allows you to do extractive question answering (i.e. the model predicts start_positions and end_positions, indicating which tokens are at the start and the end of the answer).
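To make the extractive setup concrete, here is a toy sketch of how an answer span can be picked from per-token start and end scores. It is only an illustration: `extract_answer`, the tokens, and the scores are made up for this example, and it is not the post-processing that run_qa.py actually performs.

```python
def extract_answer(tokens, start_scores, end_scores):
    """Return the highest-scoring span with start <= end (toy example)."""
    best_score = float("-inf")
    best_span = (0, 0)
    for start, start_score in enumerate(start_scores):
        # Only consider spans that end at or after the start token.
        for end in range(start, len(tokens)):
            score = start_score + end_scores[end]
            if score > best_score:
                best_score = score
                best_span = (start, end)
    start, end = best_span
    return " ".join(tokens[start:end + 1])


# Hypothetical scores, as an encoder-only QA head might produce per token.
tokens = ["Paris", "is", "the", "capital", "of", "France"]
start_scores = [0.1, 0.0, 0.2, 0.3, 0.0, 2.5]
end_scores = [0.2, 0.0, 0.0, 0.1, 0.0, 3.0]

print(extract_answer(tokens, start_scores, end_scores))
```

The point is that the answer is always a slice of the context, which is why this approach needs a model with a span-prediction head rather than a text decoder.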

However, you can fine-tune mT5 for question-answering. mT5 and T5 cast every NLP problem into a text-to-text format, so you can create training examples as:

input: 'question: <question> context: <context>'
output: '<answer>'
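As a rough sketch, the text-to-text format above could be built like this. The helper `to_text_to_text` and the keys `input_text` / `target_text` are names chosen for illustration, not part of any library; the Spanish sample is invented.

```python
def to_text_to_text(question, context, answer):
    """Turn one QA record into an (input, target) text pair for a seq2seq model."""
    return {
        "input_text": f"question: {question} context: {context}",
        "target_text": answer,
    }


# Hypothetical record; in practice this would come from your QA dataset.
example = to_text_to_text(
    question="¿Cuál es la capital de Francia?",
    context="París es la capital de Francia.",
    answer="París",
)

print(example["input_text"])
print(example["target_text"])
```

From here you would tokenize `input_text` as the model input and `target_text` as the labels, then train a sequence-to-sequence model such as MT5ForConditionalGeneration on those pairs.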

We could add a run_qa_seq2seq.py script (which would be similar to the run_summarization.py and run_translation.py scripts); see the GitHub issue "supporting t5 for question answering" (Issue #13029, huggingface/transformers).
