I don’t understand what you are trying to do. Do you want to do further fine-tuning (in which case you might want DistilBertForMaskedLM), or to classify your texts, for example for sentiment analysis (in which case you might want DistilBertForSequenceClassification)?
Thanks for your reply. I’m trying to evaluate an already fine-tuned model (to reproduce the accuracy metrics).
If I try this with mrpc or stsb (with fine-tuned models, of course), it works just fine, but I don’t know what the issue is with MNLI.