Hi everybody,
I’m planning on training a BART model for question answering, but before training I’d like to test how to actually generate an answer at inference time. I’ve looked through the documentation, but I couldn’t find an obvious answer. Can I do it with the BartForQuestionAnswering model, or do I have to use the BartForConditionalGeneration model, which has the generate() method?
Solving question answering with Transformers is usually done in one of two ways:
either extractive, where the model predicts start_scores and end_scores. In other words, the model predicts which token it believes marks the start of the answer and which token marks the end. This was introduced in the original BERT paper;
or generative, where the model simply generates the correct answer. This was introduced in the T5 paper, which treated every NLP problem as a generative (text-to-text) task. A minimal sketch of this route follows right below.
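At inference time, the generative route is just seq2seq generation. Here is a minimal sketch; the facebook/bart-base checkpoint and the "question: ... context: ..." input format are assumptions on my part (any format works, as long as you fine-tune with the same one):

```python
from transformers import BartForConditionalGeneration, BartTokenizer

# Assumed checkpoint; swap in your own fine-tuned model.
tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

question = "What does the fox do?"
context = "The quick brown fox jumps over the lazy dog."

# Assumed input format: question and context concatenated into one
# source sequence for the encoder.
inputs = tokenizer(f"question: {question} context: {context}", return_tensors="pt")

# The decoder then generates the answer autoregressively.
output_ids = model.generate(**inputs, max_new_tokens=32, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Before fine-tuning, the generated text won't look like an answer (the pretrained model was only trained on denoising), but the mechanics are exactly the same once you've fine-tuned on a QA dataset.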
As BART is a seq2seq (encoder-decoder) model similar to T5, it makes sense to use BartForConditionalGeneration. You can indeed use the .generate() method at inference time to let it generate a predicted answer. However, BartForQuestionAnswering is also available in the library, meaning you can also use BART to do BERT-like extractive question answering.
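And here is the equivalent extractive sketch with BartForQuestionAnswering. Again, facebook/bart-base is an assumed checkpoint; note that the span-classification head on top is randomly initialized until you fine-tune it (e.g. on SQuAD), so the predicted span is meaningless before training:

```python
import torch
from transformers import BartForQuestionAnswering, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForQuestionAnswering.from_pretrained("facebook/bart-base")

question = "What does the fox do?"
context = "The quick brown fox jumps over the lazy dog."

# Question and context are encoded together as a single sequence.
inputs = tokenizer(question, context, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# One score per token for being the start/end of the answer span
# (exposed as start_logits and end_logits on the output object).
start_idx = outputs.start_logits.argmax(dim=-1).item()
end_idx = outputs.end_logits.argmax(dim=-1).item()

answer_ids = inputs["input_ids"][0, start_idx : end_idx + 1]
print(tokenizer.decode(answer_ids, skip_special_tokens=True))
```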
If you ask me, option 2 (the generative one) is much simpler and more “human-like”.
Isn’t BartForConditionalGeneration trained for summarization, though? I’m going to fine-tune it, but I don’t know how much the model’s behaviour will change.
Thank you very much for your answer!