How can I use T5 for abstractive QA, I don’t want to work on a SQUAD-like dataset, but rather get answers from general questions. Is there a prefix for this kind of QA for T5?
Thank you in advance!
Hi,
For open-domain question answering, no prefix is required. Google released several checkpoints (which you can find on our hub, such as this one) from their paper; you can use them as follows:
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load a closed-book QA checkpoint (T5 fine-tuned on Natural Questions)
t5_qa_model = AutoModelForSeq2SeqLM.from_pretrained("google/t5-small-ssm-nq")
t5_tok = AutoTokenizer.from_pretrained("google/t5-small-ssm-nq")

# No task prefix is needed; the question itself is the input
input_ids = t5_tok("When was Franklin D. Roosevelt born?", return_tensors="pt").input_ids
gen_output = t5_qa_model.generate(input_ids)[0]
print(t5_tok.decode(gen_output, skip_special_tokens=True))
Hi nielsr, if I have my own Q/A dataset, is it possible to fine-tune the T5 model on it? If so, do you have ready-made code to start from?
Hi,
You can use the seq2seq QA script for that: transformers/trainer_seq2seq_qa.py at main · huggingface/transformers · GitHub
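If you want to see what the script does to your data before running it, here is a minimal sketch of the text-to-text formatting step for seq2seq QA. The exact `"question: ... context: ..."` prefix wording and the helper name are assumptions for illustration; check the preprocessing in the example script for your transformers version.

```python
# Sketch: turn one raw Q/A record into the (source, target) string pair
# that seq2seq (T5-style) QA training uses.
# `answers` follows the SQuAD convention: {"text": [...], "answer_start": [...]}.
def format_qa_example(question: str, context: str, answers: dict) -> tuple:
    # Source concatenates question and context with plain-text markers
    source = f"question: {question.strip()} context: {context.strip()}"
    # Target is the first gold answer; unanswerable questions get an empty target
    target = answers["text"][0] if answers["text"] else ""
    return source, target


src, tgt = format_qa_example(
    "When was Franklin D. Roosevelt born?",
    "Franklin D. Roosevelt was born on January 30, 1882.",
    {"text": ["January 30, 1882"], "answer_start": [34]},
)
print(src)
print(tgt)
```

The model is then trained to generate `target` given `source`, which is exactly the abstractive setting you asked about.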
Thanks nielsr for the directions; however, I still have two questions, and it would be great if you could help:
python run_seq2seq_qa.py \
  --model_name_or_path t5-small \
  --dataset_name squad_v2 \
  --context_column context \
  --question_column question \
  --answer_column answers \
  --do_train \
  --do_eval \
  --per_device_train_batch_size 12 \
  --learning_rate 3e-5 \
  --num_train_epochs 2 \
  --max_seq_length 384 \
  --doc_stride 128 \
  --output_dir /tmp/debug_seq2seq_squad/
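Since you have your own Q/A dataset, note that the example scripts can usually read a local JSON file via `--train_file` / `--validation_file` instead of `--dataset_name` (check `python run_seq2seq_qa.py --help` for your transformers version). A hedged sketch of writing your records in the column layout the flags above reference (`context`, `question`, `answers`); the file name and records are illustrations only:

```python
import json

# One SQuAD-style record per example; "answers" holds parallel lists of
# answer texts and their character offsets into the context.
context = "Franklin D. Roosevelt was born on January 30, 1882."
answer_text = "January 30, 1882"
examples = [
    {
        "id": "0",
        "question": "When was Franklin D. Roosevelt born?",
        "context": context,
        "answers": {
            "text": [answer_text],
            # Compute the offset instead of hardcoding it, to avoid mistakes
            "answer_start": [context.index(answer_text)],
        },
    },
]

with open("my_qa_train.json", "w", encoding="utf-8") as f:
    # The QA example scripts load JSON files under a top-level "data" field
    # in most versions; if yours expects plain JSON Lines, adjust accordingly.
    json.dump({"data": examples}, f, ensure_ascii=False)
```

You would then pass `--train_file my_qa_train.json` (and a matching validation file) in place of `--dataset_name squad_v2`.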