Is there a reason why the new ModernBERT does not support the question answering task?
@tomaarsen Maybe you know?
Hello!
ModernBERT may not support question answering simply because of its design and intended use cases. Newer BERT variants often ship with heads only for the tasks their authors prioritized, such as classification or token tagging.
Question answering (QA) typically requires fine-tuning on datasets labeled specifically for QA (SQuAD and the like), which may not have been a focus of ModernBERT’s release. If you’re interested in using ModernBERT for QA, you could potentially fine-tune it on a QA dataset yourself, or explore variants like RoBERTa or DistilBERT, which already have dedicated question-answering classes in transformers.
I hope that helps clarify things!
Hi!
I’m talking about finetuning it on a QA dataset, but the class ModernBertForQuestionAnswering, required for finetuning, simply does not exist.
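To illustrate (assuming the official answerdotai/ModernBERT-base checkpoint and a transformers release without this class), the usual auto-class route fails:

```python
from transformers import AutoModelForQuestionAnswering

# With no ModernBertForQuestionAnswering in the library, the auto class
# cannot map ModernBertConfig to a QA head and raises a ValueError
# ("Unrecognized configuration class ..." or similar, depending on version).
model = AutoModelForQuestionAnswering.from_pretrained("answerdotai/ModernBERT-base")
```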
Looks like there’s a SQuAD2 finetune already out there, but you need to turn on trust_remote_code because they implement ModernBertForQuestionAnswering themselves: modelling.py · Praise2112/ModernBERT-large-squad2-v0.1 at main.
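For reference, loading that checkpoint could look roughly like this (a minimal sketch, assuming the repo maps its custom class to the auto API; trust_remote_code=True executes the repo’s own modelling.py, so it’s worth reviewing that file first):

```python
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

model_id = "Praise2112/ModernBERT-large-squad2-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# trust_remote_code=True is required because ModernBertForQuestionAnswering
# lives in the repo's modelling.py rather than in transformers itself.
model = AutoModelForQuestionAnswering.from_pretrained(model_id, trust_remote_code=True)
```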
It would be great if transformers supported this natively!
That’s great, but note that “they” are not the official ModernBERT creators: the ModernBERT model repo does not have any modelling.py script.
But it’s good to have an example of how one could manually set it up. It would be great to have that modelling script on the official repos!
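In the meantime, here is a minimal sketch of what such a class could look like, modeled on the standard *ForQuestionAnswering heads elsewhere in transformers (untested; the head design here is my assumption, not an official implementation):

```python
from torch import nn
from transformers.modeling_outputs import QuestionAnsweringModelOutput
from transformers.models.modernbert.modeling_modernbert import (
    ModernBertModel,
    ModernBertPreTrainedModel,
)


class ModernBertForQuestionAnswering(ModernBertPreTrainedModel):
    """Sketch of a span-extraction QA head on top of ModernBERT
    (an assumption, not the official implementation)."""

    def __init__(self, config):
        super().__init__(config)
        self.model = ModernBertModel(config)
        # Two logits per token: answer-span start and end.
        self.qa_outputs = nn.Linear(config.hidden_size, 2)
        self.post_init()

    def forward(self, input_ids=None, attention_mask=None,
                start_positions=None, end_positions=None, **kwargs):
        outputs = self.model(input_ids, attention_mask=attention_mask, **kwargs)
        sequence_output = outputs[0]  # (batch, seq_len, hidden_size)

        logits = self.qa_outputs(sequence_output)
        start_logits, end_logits = logits.split(1, dim=-1)
        start_logits = start_logits.squeeze(-1).contiguous()
        end_logits = end_logits.squeeze(-1).contiguous()

        total_loss = None
        if start_positions is not None and end_positions is not None:
            # Clamp gold positions that fall outside the sequence, as the
            # standard QA heads in transformers do.
            ignored_index = start_logits.size(1)
            start_positions = start_positions.clamp(0, ignored_index)
            end_positions = end_positions.clamp(0, ignored_index)
            loss_fct = nn.CrossEntropyLoss(ignore_index=ignored_index)
            start_loss = loss_fct(start_logits, start_positions)
            end_loss = loss_fct(end_logits, end_positions)
            total_loss = (start_loss + end_loss) / 2

        return QuestionAnsweringModelOutput(
            loss=total_loss,
            start_logits=start_logits,
            end_logits=end_logits,
        )
```

With a class like this saved alongside a checkpoint (and wired up via an auto_map entry in config.json), the Trainer and the question-answering pipeline should work much as they do for BertForQuestionAnswering.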