Fine-tune mT5 on Question Answering with run_qa

Hello everyone, I wanted to fine-tune an mT5 model on QA with run_qa.py, but it doesn’t work.
I’m a beginner, so I have no idea how to solve the problem. Does anyone know how to make it work?


Are you sure MT5 can do Question Answering? Would BERT be better?

I can’t help, but I suggest you give a few more details: what happens when it “doesn’t work”?

Hello @xrazor9, were you eventually able to do it? I am also a beginner and I want to fine-tune mT5 for Q&A, and also for abstractive summarization (both in Spanish), but the tutorials I’ve found aren’t working for me.

Thanks in advance! :grin:

Hi,

mT5 is, like T5, an encoder-decoder model. The run_qa.py script only supports encoder-only models (like BERT, RoBERTa, DistilBERT, etc.), and it only does extractive question answering: the model predicts start_positions and end_positions, indicating which tokens mark the start and the end of the answer span within the context.
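
To make the distinction concrete, here is a minimal sketch of what an extractive QA model (the kind run_qa.py trains) does at inference time. The checkpoint name and the example question/context are illustrative:

```python
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

# An encoder-only model already fine-tuned for extractive QA (illustrative choice)
checkpoint = "distilbert-base-cased-distilled-squad"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForQuestionAnswering.from_pretrained(checkpoint)

question = "Where do penguins live?"
context = "Penguins live almost exclusively in the Southern Hemisphere."

inputs = tokenizer(question, context, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# The model outputs logits over token positions: the answer is the span
# between the most likely start token and the most likely end token.
start = int(outputs.start_logits.argmax())
end = int(outputs.end_logits.argmax())
answer = tokenizer.decode(inputs["input_ids"][0][start : end + 1])
print(answer)
```

Note that the answer can only ever be a span copied out of the context, which is exactly why this setup doesn’t fit an encoder-decoder model like mT5.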

However, you can still fine-tune mT5 for question answering. mT5 and T5 cast every NLP problem into a text-to-text format, so you can create training examples like:

input: 'question: <question> context: <context>'
output: '<answer>'
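
A single training step in that format could look like the following sketch. The google/mt5-small checkpoint and the Spanish example are illustrative, and the text_target argument assumes a reasonably recent transformers version; in practice you would wrap this in a Seq2SeqTrainer loop over a full dataset:

```python
from transformers import AutoTokenizer, MT5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")

# One text-to-text training example (illustrative Spanish QA pair)
source = ("question: ¿Dónde viven los pingüinos? "
          "context: Los pingüinos viven casi exclusivamente en el hemisferio sur.")
target = "en el hemisferio sur"

inputs = tokenizer(source, return_tensors="pt")
labels = tokenizer(text_target=target, return_tensors="pt").input_ids

# The decoder learns to generate the answer text directly.
loss = model(**inputs, labels=labels).loss
loss.backward()

# At inference time, the fine-tuned model generates the answer as free text:
pred_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(pred_ids[0], skip_special_tokens=True))
```

Because the answer is generated rather than extracted, this setup also handles answers that are not a literal span of the context.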

We could add a run_qa_seq2seq.py script (which would be similar to the run_summarization.py and run_translation.py scripts); see supporting t5 for question answering · Issue #13029 · huggingface/transformers · GitHub.