Is it possible to use a BERT model that has already been fine-tuned (e.g. a SQuAD-tuned BERT) on a masked LM task? I suspect that the question-answering head that is added on top of BERT is fundamentally incompatible with a masked LM task, but I’d like to know for sure.
I’ve attempted to do this, using:
from transformers import AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer.from_pretrained("deepset/bert-base-cased-squad2")
model = AutoModelForMaskedLM.from_pretrained("deepset/bert-base-cased-squad2")
but the masked-token predictions are very poor, so maybe I’m missing a step.
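One way to check whether a step is missing is to look at which weights actually get loaded when a question-answering checkpoint is opened as a masked LM. The following is only a minimal sketch: to avoid downloading anything it uses a tiny, randomly initialised BertForQuestionAnswering (the config sizes are made up) as a stand-in for the SQuAD-tuned checkpoint, and it relies on the `output_loading_info=True` option of `from_pretrained`.

```python
import tempfile

from transformers import BertConfig, BertForMaskedLM, BertForQuestionAnswering

# Tiny config with hypothetical sizes, used only so that nothing is downloaded.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    max_position_embeddings=64,
)

# Stand-in for a SQuAD-tuned checkpoint: BERT encoder plus a span-prediction head.
qa_model = BertForQuestionAnswering(config)

with tempfile.TemporaryDirectory() as ckpt_dir:
    qa_model.save_pretrained(ckpt_dir)
    # Reload the same weights under a masked-LM architecture and
    # ask the loader which parameter names did not line up.
    mlm_model, info = BertForMaskedLM.from_pretrained(
        ckpt_dir, output_loading_info=True
    )

missing = sorted(info["missing_keys"])        # expected by the MLM model, absent from the checkpoint
unexpected = sorted(info["unexpected_keys"])  # present in the checkpoint, unused by the MLM model

print("missing:", missing)
print("unexpected:", unexpected)
```

If the same pattern shows up with the real checkpoint (keys under `cls.predictions` reported as missing and `qa_outputs` reported as unexpected), the masked-LM head was not loaded from the checkpoint but freshly initialised, which would be consistent with the poor predictions.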
Thanks!