T5 for closed book QA

How can I use T5 for abstractive QA? I don’t want to work on a SQuAD-like dataset, but rather get answers to general questions. Is there a prefix for this kind of QA with T5?

Thank you in advance!

Hi,

For open-domain (closed book) question answering, no prefix is required. Google released several checkpoints from their paper (which you can find on our hub, such as this one), and you can use them as follows:

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load a T5 checkpoint fine-tuned for closed book QA on Natural Questions
t5_qa_model = AutoModelForSeq2SeqLM.from_pretrained("google/t5-small-ssm-nq")
t5_tok = AutoTokenizer.from_pretrained("google/t5-small-ssm-nq")

# The question goes in directly, without any task prefix
input_ids = t5_tok("When was Franklin D. Roosevelt born?", return_tensors="pt").input_ids
gen_output = t5_qa_model.generate(input_ids)[0]

print(t5_tok.decode(gen_output, skip_special_tokens=True))
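
Note that generate uses greedy decoding and a fairly short default maximum length, so if answers come out truncated you can pass generation arguments explicitly. The values below are only illustrative, not something the checkpoint requires:

# Illustrative generation settings (assumed values, tune as needed)
beam_output = t5_qa_model.generate(
    input_ids,
    num_beams=4,          # beam search instead of greedy decoding
    max_new_tokens=32,    # allow longer answers
    early_stopping=True,
)
print(t5_tok.decode(beam_output[0], skip_special_tokens=True))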

Hi nielsr, if I have my own Q/A dataset, is it possible to fine-tune the T5 model on it? If so, do you have ready-made code to start from?

Hi,

You can use the seq2seq QA script for that: transformers/trainer_seq2seq_qa.py at main · huggingface/transformers · GitHub
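
If your dataset consists only of question/answer pairs (no context), you can also fine-tune T5 directly with Seq2SeqTrainer. Here is a minimal sketch under my own assumptions: the CSV file name, the column names and the hyperparameters are placeholders, not something from the official script.

# Minimal closed book fine-tuning sketch (file name, columns and hyperparameters are assumptions)
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

dataset = load_dataset("csv", data_files={"train": "my_qa_train.csv"})["train"]

def preprocess(examples):
    # The question is the encoder input, the answer is the decoder target
    model_inputs = tokenizer(examples["question"], max_length=64, truncation=True)
    labels = tokenizer(text_target=examples["answer"], max_length=32, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(preprocess, batched=True, remove_columns=dataset.column_names)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(
        output_dir="t5-closed-book-qa",
        per_device_train_batch_size=8,
        learning_rate=3e-5,
        num_train_epochs=3,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()

Since this is closed book, no context is tokenized at all; the model has to learn to answer from its parameters.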

Thanks nielsr for the directions; however, I still have two enquiries and it would be great if you could help:

  1. The code provided is for an open-book QA problem, as it requires the context; in closed-book problems, the context is not given and the model needs to answer from its memory (see the preprocessing sketch at the end of this post):

python run_seq2seq_qa.py \
  --model_name_or_path t5-small \
  --dataset_name squad_v2 \
  --context_column context \
  --question_column question \
  --answer_column answers \
  --do_train \
  --do_eval \
  --per_device_train_batch_size 12 \
  --learning_rate 3e-5 \
  --num_train_epochs 2 \
  --max_seq_length 384 \
  --doc_stride 128 \
  --output_dir /tmp/debug_seq2seq_squad/

  2. Any idea if the script uses a prefix while training? As per the T5 paper, a prefix needs to be added to the training data so that the model concentrates on the required task (translation, summarization, question answering, …).
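
For reference, this is the kind of preprocessing I have in mind for both points. It is only a sketch based on my own assumptions (the blanked-out context, the "question: " prefix and the output file names are all placeholders, not something the script documents):

# Sketch: turn SQuAD v2 into a "closed-book" dataset by blanking the context,
# and optionally prepend a task prefix to every question.
from datasets import load_dataset

prefix = "question: "  # assumed prefix; must match whatever is used at inference time

def to_closed_book(examples):
    examples["context"] = ["" for _ in examples["context"]]            # drop the passage text
    examples["question"] = [prefix + q for q in examples["question"]]  # add the prefix
    return examples

squad = load_dataset("squad_v2")
closed_book = squad.map(to_closed_book, batched=True)

# Export to JSON so it can be passed to the script via its --train_file / --validation_file arguments
closed_book["train"].to_json("squad_v2_closed_book_train.json")
closed_book["validation"].to_json("squad_v2_closed_book_validation.json")

Whatever prefix (or no prefix) is used at training time then has to be used when querying the fine-tuned model as well.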