Pegasus on QA task

Hello everyone,

I am trying to use Pegasus on the SQuAD dataset. I am using the notebook for Pegasus fine-tuning on XSum.

I use all the hyperparameters from the above notebook. After training, I got the following loss values.

/usr/local/lib/python3.7/dist-packages/transformers/optimization.py:562: UserWarning: This overload of add_ is deprecated: add_(Number alpha, Tensor other) Consider using one of the following signatures instead: add_(Tensor other, *, Number alpha) (Triggered internally at /pytorch/torch/csrc/utils/python_arg_parser.cpp:1005.) exp_avg_sq_row.mul_(beta2t).add_(1.0 - beta2t, update.mean(dim=-1))

[2000/2000 46:21, Epoch 2000/2000]

| Step | Training Loss | Validation Loss |
|-----:|--------------:|----------------:|
| 100 | 11.515400 | 11.971568 |
| 200 | 9.583800 | 10.400596 |
| 300 | 7.782900 | 8.812467 |
| 400 | 1.653800 | 1.603077 |
| 500 | 0.006000 | 0.367762 |
| 600 | 0.049500 | 0.271221 |
| 700 | 0.000600 | 0.271198 |
| 800 | 0.000300 | 0.282740 |
| 900 | 0.000200 | 0.311012 |
| 1000 | 0.000200 | 0.308879 |
| 1100 | 0.000100 | 0.316374 |
| 1200 | 0.000100 | 0.329436 |
| 1300 | 0.000100 | 0.326996 |
| 1400 | 0.020200 | 0.291157 |
| 1500 | 0.002400 | 0.272543 |
| 1600 | 0.000100 | 0.291934 |
| 1700 | 0.035900 | 0.308235 |
| 1800 | 0.000100 | 0.308755 |
| 1900 | 0.000100 | 0.311096 |
| 2000 | 0.000000 | 0.311382 |

CPU times: user 1h 1min 52s, sys: 18min 49s, total: 1h 20min 41s Wall time: 46min 26s

TrainOutput(global_step=2000, training_loss=1.8480298345236805, metrics={'train_runtime': 2783.2468, 'train_samples_per_second': 0.719, 'total_flos': 0, 'epoch': 2000.0, 'init_mem_cpu_alloc_delta': 8192, 'init_mem_gpu_alloc_delta': 0, 'init_mem_cpu_peaked_delta': 0, 'init_mem_gpu_peaked_delta': 0, 'train_mem_cpu_alloc_delta': -4224876544, 'train_mem_gpu_alloc_delta': 2288025088, 'train_mem_cpu_peaked_delta': 4242300928, 'train_mem_gpu_peaked_delta': 8677123584})
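
For reference, here is roughly how the training run is set up (just a sketch; the argument values below are placeholders rather than the XSum notebook's exact settings):

```python
# Rough sketch of the training setup; the values here are placeholders
# standing in for the XSum notebook's hyperparameters, not an exact copy.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="pegasus-squad",       # hypothetical output directory
    evaluation_strategy="steps",      # evaluate every eval_steps, as in the log above
    eval_steps=100,
    logging_steps=100,
    learning_rate=2e-5,               # placeholder value
    per_device_train_batch_size=1,    # placeholder value
    per_device_eval_batch_size=1,     # placeholder value
    weight_decay=0.01,                # placeholder value
    max_steps=2000,                   # matches the 2000 steps in the log above
    predict_with_generate=True,
)
# The Seq2SeqTrainer itself is built exactly as in the XSum notebook,
# just with the SQuAD splits passed in instead.
```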

My question is: what hyperparameter values (optimizer, learning rate, etc.) should I use for a QA task? Also, the training loss shown in TrainOutput does not match the loss values in the table. What could be the reason?

hey @helloworld123-lab, are you sure that pegasus can be used for a reading comprehension task like SQuAD? my understanding is that it is designed for abstractive tasks like summarization (although i’d be interested to hear otherwise!).

as an alternative, i’d suggest checking out the official question-answering tutorial here: Google Colaboratory

Thank you @lewtun for your reply. Actually, I'm not sure about that. I could not find a source on this either. However, I checked the model here, and I wonder how this could be done with Pegasus or BART.

ah so that notebook is about T5, which does support question-answering via its special text2text formulation. perhaps a better question from me would be: what specific task are you trying to solve?
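
to make the text2text idea concrete, here's a minimal sketch (i'm using `t5-small` and the `question: ... context: ...` prefix from T5's SQuAD setup; the example text is invented):

```python
# minimal sketch of T5's text2text formulation of QA: the question and the
# context go into one input string and the answer comes out as generated text.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

text = (
    "question: What is Pegasus designed for? "
    "context: Pegasus is a transformer model pre-trained with a "
    "gap-sentence generation objective for abstractive summarization."
)
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```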

if you want to use BART, my suggestion would be to see if one of the checkpoints on the hub is compatible with the tutorial i linked to earlier :slight_smile:
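
for example, something along these lines should drop into that tutorial (a sketch; `facebook/bart-base` is just an illustrative checkpoint):

```python
# sketch of plugging a BART checkpoint into the extractive QA tutorial
# (the QA head on top of BART is randomly initialised until you fine-tune it).
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

model_checkpoint = "facebook/bart-base"  # swap in whichever checkpoint you want to try
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
model = AutoModelForQuestionAnswering.from_pretrained(model_checkpoint)
# from here the tutorial is unchanged: preprocess SQuAD with this tokenizer
# and pass the model to the Trainer.
```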

Thank you, I am just a beginner working on QA. I have just started training with the notebook you shared, passing a BART checkpoint.
I have one more question :slight_smile: I just saw some notebooks on Bert2bert/RobertaShared etc. on CNN/DailyMail for summarization. Can I use such models for QA systems?

as far as i know, you can’t use models that are fine-tuned for summarisation directly for question-answering as the tasks are quite different:

  • in the summarisation case, the model is provided with (document, summary) tuples
  • in the question answering case, the model is provided with (question, document, answers) tuples (with potentially more subfields like answer_start if we’re dealing with the extractive case where we need to identify a span of text where the answer exists)

so the main problem i see with naively using a summarisation model for QA is that the model does not expect a question / query in its inputs and so will not generate an answer.
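
to make that concrete, here are two made-up records in the shapes described above (the QA fields mirror the SQuAD format on the hub; the text itself is invented):

```python
# summarisation: (document, summary) pairs
summarization_example = {
    "document": "The quick brown fox jumped over the lazy dog. It then ran away.",
    "summary": "A fox jumped over a dog.",
}

# extractive QA: (question, context, answers) with character offsets
qa_example = {
    "question": "What did the fox jump over?",
    "context": "The quick brown fox jumped over the lazy dog.",
    "answers": {"text": ["the lazy dog"], "answer_start": [32]},
}
```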

btw if you’re looking to build an end-to-end QA system, there’s a neat library called haystack that is based on transformers and provides a lot of nice functionality to store documents, query them etc: https://haystack.deepset.ai/
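
as a taste of what that looks like, here's a rough sketch using Haystack 1.x's extractive QA pipeline (the library's API has changed quite a bit between releases, so treat the imports and class names here as assumptions and check the current docs):

```python
# rough sketch of an end-to-end extractive QA pipeline with Haystack 1.x
# (imports/class names are assumptions for that version; the document text is made up).
from haystack.document_stores import InMemoryDocumentStore
from haystack.nodes import FARMReader, TfidfRetriever
from haystack.pipelines import ExtractiveQAPipeline

document_store = InMemoryDocumentStore()
document_store.write_documents(
    [{"content": "Pegasus is a transformer model for abstractive summarization."}]
)

retriever = TfidfRetriever(document_store=document_store)
reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2")
pipeline = ExtractiveQAPipeline(reader=reader, retriever=retriever)

result = pipeline.run(query="What is Pegasus?", params={"Retriever": {"top_k": 3}})
print(result["answers"])
```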
