How to do multi-span question answering?

The BertForQuestionAnswering architecture is well suited to single-span question answering, i.e., extracting a single span (the answer) from a context given a question. It outputs two tensors: start_logits & end_logits.
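To make the single-span case concrete, here is a pure-Python toy sketch (no transformers, made-up logit values) of how start_logits / end_logits are typically decoded into one answer span, by picking the highest-scoring valid (start, end) pair:

```python
# Toy sketch: decode a single answer span from start/end logits,
# in the style of BertForQuestionAnswering's output heads.
# The logit values below are invented for illustration.

def decode_span(start_logits, end_logits):
    """Pick the (start, end) pair with the highest combined score,
    subject to start <= end."""
    best = (0, 0)
    best_score = float("-inf")
    for s, s_logit in enumerate(start_logits):
        for e, e_logit in enumerate(end_logits):
            if e < s:
                continue
            score = s_logit + e_logit
            if score > best_score:
                best_score = score
                best = (s, e)
    return best

start_logits = [0.1, 2.5, 0.3, 0.2]
end_logits = [0.0, 0.4, 3.1, 0.2]
print(decode_span(start_logits, end_logits))  # -> (1, 2)
```

Note this always returns exactly one span, which is why the architecture doesn't directly fit the multi-span case below.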

But what if the "train" split of the dataset provides multiple spans as the answer? (In this case, the objective is to predict multiple spans as the answer for a single example.)

What architecture works for this problem? Do we have to create something custom, or does HF Transformers provide a class (similar to BertForQuestionAnswering)?


A paper showed that you can treat multi-span question answering as token classification, and they got really nice results with it.

So basically you can treat a multi-span QA problem as a sequence labeling problem, hence you can use BertForTokenClassification.
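As a hedged sketch of what that labeling looks like: each context token gets a BIO tag (B = begin answer span, I = inside, O = outside), and those tags are what you'd train BertForTokenClassification on with num_labels=3. The span indices below are hypothetical:

```python
# Sketch: cast multi-span QA as sequence labeling by converting
# answer spans into per-token BIO labels. Assumes token-level
# (start, end) indices with end inclusive; values are illustrative.

LABELS = {"O": 0, "B": 1, "I": 2}

def spans_to_bio(num_tokens, answer_spans):
    """answer_spans: list of (start, end) token indices, end inclusive."""
    labels = [LABELS["O"]] * num_tokens
    for start, end in answer_spans:
        labels[start] = LABELS["B"]
        for i in range(start + 1, end + 1):
            labels[i] = LABELS["I"]
    return labels

# Two answer spans in a 10-token context: tokens 2-3 and token 7.
print(spans_to_bio(10, [(2, 3), (7, 7)]))
# -> [0, 0, 1, 2, 0, 0, 0, 1, 0, 0]
```

At inference time you'd run the tagger over the context and read off every maximal B/I run as one predicted answer span, so the model can naturally emit zero, one, or many spans.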
