Poor performance in zero-shot learning when using the model 'typeform/distilbert-base-uncased-mnli'

Hi,

I have tried to use the model ‘typeform/distilbert-base-uncased-mnli’ for zero-shot classification (multi-class, not multi-label). However, I am getting very poor results, especially compared to the model ‘facebook/bart-large-mnli’. I have tried it both with and without the zero-shot classification pipeline, and the results are just as bad either way.
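For concreteness, this is roughly how I am calling the pipeline (a minimal sketch; the example text and candidate labels below are placeholders, not my actual dataset):

```python
from transformers import pipeline

text = "The central bank raised interest rates by half a point."  # placeholder example
labels = ["economy", "sports", "health"]  # placeholder categories

# Run the same zero-shot call for both models and compare the top prediction.
for model_name in ["typeform/distilbert-base-uncased-mnli", "facebook/bart-large-mnli"]:
    classifier = pipeline("zero-shot-classification", model=model_name)
    result = classifier(text, candidate_labels=labels, multi_label=False)
    print(model_name, "->", result["labels"][0], round(result["scores"][0], 3))
```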

The test dataset I am trying to classify has 46 entries and 13 categories. The accuracy I get with the DistilBERT MNLI model is around 15%, whereas it goes up to 57% with the BART-large MNLI model.

Has anyone else found such a massive difference in performance when using a distilled model for this task? I assumed the two would be comparable, since DistilBERT and BERT perform comparably on many NLP tasks, but the gap here is far larger than I expected.

Also, does anyone have benchmark results for these two models, on either NLI or zero-shot classification tasks?

Thank you!

Hello,

I am the one who fine-tuned this model. The original DistilBERT paper reports 82.2 accuracy on MNLI, while BERT-base reaches 86.7. Subsequent papers report slightly different numbers, but in the same ballpark: for example, the MobileBERT paper reports 81.5 for DistilBERT and 84.6 for BERT-base.

In my fine-tuning, I got 82 accuracy on both MNLI and MNLI-mm. I used the run_glue.py script (huggingface/transformers/blob/master/examples/pytorch/text-classification/run_glue.py) to fine-tune the model with these hyperparameters (a sketch of the full invocation follows the list):

  • max_seq_length: 128
  • per_device_train_batch_size: 16
  • learning_rate: 2e-5
  • num_train_epochs: 5
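Roughly, the invocation would look like this (a sketch: the base model is assumed to be distilbert-base-uncased, and the output directory is a placeholder):

```bash
python run_glue.py \
  --model_name_or_path distilbert-base-uncased \
  --task_name mnli \
  --do_train \
  --do_eval \
  --max_seq_length 128 \
  --per_device_train_batch_size 16 \
  --learning_rate 2e-5 \
  --num_train_epochs 5 \
  --output_dir ./distilbert-base-uncased-mnli
```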

When running this model on our own very small zero-shot classification test data, we didn’t see a big drop in accuracy, but we did observe that the model is less “certain” about the correct answer, i.e., it returns a lower probability for the correct label.

You can also try our fine-tuned MobileBERT; it gave marginally better results in our testing.


Hi @hhschu,

Thank you for the information regarding this model; it is incredibly useful. I also noticed that the scores given by this model were lower than those from other models, which is odd.

I will try to use the fine-tuned MobileBERT then and see whether the results improve. What accuracy did you obtain from fine-tuning MobileBERT on MNLI and MNLI-mm, if that is alright to ask?

@valkyrie We got 84 accuracy with MobileBERT. Qualitatively, it’s still much worse than RoBERTa (91 accuracy on MNLI in our experiment) on our zero-shot test data, in terms of certainty about the correct label, but it’s better than DistilBERT.


@hhschu thank you again. Just to confirm: in your zero-shot classification experiments, did you use the model to obtain one class per entry (i.e. setting multi_label=False) rather than in a multi-label scenario? That is how I am using it at the moment.

@valkyrie our zero-shot test data is single-label, indeed. In this case, turning multi-label on or off doesn’t make much difference in accuracy, in my opinion: multi-label off is just a softmax layer on top of multi-label on’s per-label scores, after all, so the top label is the same.
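For anyone who wants to check this on their own data, here is a minimal sketch (placeholder text and labels) that prints the top label and its score in both modes:

```python
from transformers import pipeline

clf = pipeline("zero-shot-classification",
               model="typeform/distilbert-base-uncased-mnli")
text = "The team won the championship after a dramatic final."  # placeholder
labels = ["sports", "politics", "technology"]  # placeholder categories

# Compare the top-ranked label with multi-label off vs. on.
single = clf(text, candidate_labels=labels, multi_label=False)
multi = clf(text, candidate_labels=labels, multi_label=True)
print("multi_label=False:", single["labels"][0], round(single["scores"][0], 3))
print("multi_label=True: ", multi["labels"][0], round(multi["scores"][0], 3))
```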

@hhschu thank you for all your help with this, I really appreciate your input and the information you’ve given me.