Hello everyone, I wanted to fine-tune an mT5 model on QA with run_qa.py, but it doesn’t work.
I’m a beginner, so I have no idea how to solve the problem. Does anyone know how to make it work?
Are you sure mT5 can do question answering? Would BERT be better?
I can’t help, but I suggest you give a few more details: what happens when it “doesn’t work”?
Hello @xrazor9, were you eventually able to do it? I am also a beginner and I want to fine-tune mT5 for Q&A, and also for abstractive summarization (both in Spanish), but the tutorials I’ve found aren’t working for me.
Thanks in advance!
Hi,
mT5 is, like T5, an encoder-decoder model. The run_qa.py script only supports encoder-only models (like BERT, RoBERTa, DistilBERT, etc.). This script only allows you to do extractive question answering (i.e. the model predicts start_positions and end_positions, indicating which tokens mark the start and the end of the answer).
However, you can fine-tune mT5 for question answering. mT5 and T5 cast every NLP problem into a text-to-text format, so you can create training examples as:
input: 'question: <question> context: <context>'
output: '<answer>'
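For instance, preparing a single training example could look like this. This is a minimal sketch, not from the original post: the google/mt5-small checkpoint, a recent transformers release (for the text_target argument), and the Spanish example strings are all assumptions.

```python
# Minimal sketch: one QA training example in text-to-text format for mT5.
# Assumptions: google/mt5-small checkpoint, recent transformers release.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-small")

question = "¿Cuál es la capital de Francia?"
context = "París es la capital y la ciudad más poblada de Francia."
answer = "París"

# Cast the example into the text-to-text format described above.
input_text = f"question: {question} context: {context}"

inputs = tokenizer(input_text, return_tensors="pt", truncation=True)
labels = tokenizer(text_target=answer, return_tensors="pt").input_ids

# The forward pass returns the cross-entropy loss you would
# backpropagate during fine-tuning.
outputs = model(**inputs, labels=labels)
print(outputs.loss)
```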
We could add a run_qa_seq2seq.py script (which would be similar to the run_summarization.py and run_translation.py scripts); see supporting t5 for question answering · Issue #13029 · huggingface/transformers · GitHub.
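In the meantime, you can write the seq2seq fine-tuning yourself with Seq2SeqTrainer, in the spirit of such a run_qa_seq2seq.py script. A rough sketch; the toy dataset and hyperparameters below are assumptions, not from this thread:

```python
# Rough sketch of fine-tuning mT5 on text-to-text QA pairs with
# Seq2SeqTrainer. Assumptions: google/mt5-small, a tiny in-memory
# dataset standing in for a real (e.g. Spanish) QA dataset, and
# illustrative hyperparameters.
from datasets import Dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-small")

# Toy dataset; replace with a real QA dataset in practice.
raw = Dataset.from_dict({
    "question": ["¿Cuál es la capital de Francia?"],
    "context": ["París es la capital y la ciudad más poblada de Francia."],
    "answer": ["París"],
})

def preprocess(batch):
    # Build the 'question: ... context: ...' inputs and tokenize the
    # answers as targets.
    inputs = [
        f"question: {q} context: {c}"
        for q, c in zip(batch["question"], batch["context"])
    ]
    model_inputs = tokenizer(inputs, max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["answer"], max_length=64,
                       truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = raw.map(preprocess, batched=True,
                    remove_columns=raw.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="mt5-qa",
    per_device_train_batch_size=8,
    num_train_epochs=3,
    learning_rate=1e-4,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```

Note that DataCollatorForSeq2Seq pads the labels with -100 by default, so padding tokens are ignored when computing the loss.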