Project: Create a new zero-shot model with NLI data

Description

The zero-shot classification pipeline has become very popular on Hugging Face. It allows you to classify a text into any set of categories without having to fine-tune a model for the specific classification task you are interested in.
The zero-shot pipeline is based on models trained on Natural Language Inference (NLI). This project will train a new NLI model, which can then be used in the zero-shot classification pipeline.
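The mechanics can be sketched without any model: the pipeline turns each candidate label into an NLI hypothesis (by default something like "This example is {label}.") and ranks the labels by the model's entailment probability for that hypothesis. Here is a minimal mock-up of that idea; `mock_entailment_logit` is a toy stand-in for a real NLI model, not part of any library:

```python
import math

def zero_shot_rank(text, labels, entailment_logit_fn,
                   template="This example is {}."):
    """Rank candidate labels by entailment probability.

    entailment_logit_fn(premise, hypothesis) stands in for a real NLI
    model and returns a raw entailment score (logit) for the pair.
    """
    logits = [entailment_logit_fn(text, template.format(l)) for l in labels]
    # Softmax over the entailment logits, as the pipeline does when the
    # candidate labels are treated as mutually exclusive.
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return sorted(zip(labels, probs), key=lambda pair: -pair[1])

# Toy scoring function: a real model would score semantic entailment;
# this one just checks for a hand-picked keyword to keep the demo runnable.
def mock_entailment_logit(premise, hypothesis):
    return 2.0 if "technology" in hypothesis and "phone" in premise else 0.0

ranked = zero_shot_rank("The new phone has a great camera",
                        ["technology", "politics"],
                        mock_entailment_logit)
```

In real use the scoring function is an NLI model's forward pass; the template trick is what lets a single model handle arbitrary label sets.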

Model(s)

Any base model can be used. Since there are already several NLI models on the model hub, I suggest training a new model based on Microsoft’s DeBERTa-v3 model. Version three was only published a few weeks ago and can outperform larger models (see an example here).
We can probably create a new SOTA NLI model with the new DeBERTa version and enough NLI data.

Datasets

Established NLI datasets include:
MultiNLI
SNLI
ANLI

Other interesting NLI datasets include:
FEVER-NLI
DocNLI
LingNLI
More datasets can be included!

Challenges

  • NLI models can be trained either as 3-class classifiers (entailment/neutral/contradiction) or as 2-class classifiers (entailment/not_entailment). Each setup has its own advantages and disadvantages.
  • There is a lot of NLI data (2 million+ texts in the datasets linked above), which makes training computationally expensive. Optimising the training pipeline is a challenge.
  • Many different datasets can be translated into NLI-format. Including more datasets can be beneficial, but requires manual transformation of datasets.
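For the 2-class setup, the usual trick is to collapse "neutral" and "contradiction" into a single "not_entailment" class. A minimal sketch of that relabelling for examples stored as (premise, hypothesis, label) triples — the label ids here follow the common MNLI convention 0=entailment, 1=neutral, 2=contradiction, which is an assumption you should verify per dataset:

```python
# MNLI-style 3-class label ids (assumed convention; check the actual dataset)
THREE_CLASS = {0: "entailment", 1: "neutral", 2: "contradiction"}
# Collapse neutral and contradiction into one negative class
TWO_CLASS = {"entailment": 0, "not_entailment": 1}

def to_two_class(label_id):
    name = THREE_CLASS[label_id]
    return TWO_CLASS["entailment" if name == "entailment" else "not_entailment"]

examples = [
    ("A man is eating.", "A person eats.", 0),      # entailment
    ("A man is eating.", "A man is sleeping.", 2),  # contradiction
    ("A man is eating.", "A man eats pasta.", 1),   # neutral
]
binary = [(p, h, to_two_class(y)) for p, h, y in examples]
```

The 2-class variant doubles the negative class size and matches how the zero-shot pipeline actually uses the model (only the entailment score matters), at the cost of discarding the neutral/contradiction distinction.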

Desired project outcomes

  • Create a Streamlit or Gradio app on :hugs: Spaces that provides an interface for zero-shot classification with a new NLI model in the backend.
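As a rough sketch of what such a Space could look like with Gradio (the checkpoint name is one of the NLI models mentioned later in this thread; everything else here is an illustrative assumption, not a finished app):

```python
import gradio as gr
from transformers import pipeline

# Any NLI checkpoint works as the backend; this name is just an example.
classifier = pipeline("zero-shot-classification",
                      model="MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli")

def classify(text, labels):
    # Labels arrive as a comma-separated string from the textbox.
    result = classifier(text,
                        candidate_labels=[l.strip() for l in labels.split(",")])
    return dict(zip(result["labels"], result["scores"]))

demo = gr.Interface(
    fn=classify,
    inputs=[gr.Textbox(label="Text to classify"),
            gr.Textbox(label="Comma-separated candidate labels")],
    outputs=gr.Label(label="Scores"),
)

if __name__ == "__main__":
    demo.launch()
```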

Additional resources

See the links to the datasets above. Also see Joe Davison’s original blog post on the zero-shot pipeline

Discord channel

To chat and organise with other people interested in this project, head over to our Discord and:

  • Follow the instructions on the #join-course channel

  • Join the #zero-shot channel

Just make sure you comment here to indicate that you’ll be contributing to this project :slight_smile:


Hey! I’d love to contribute to this one. I guess further discussion will take place in Discord?


Hey @HarrySaini, yep Discord is probably the best place to coordinate / discuss more efficiently :slight_smile:


Interested. Messaged in Discord as well.


I happen to have trained a bilingual MNLI model recently, so I decided to give this project a shot as well (without taking a place on the team). In theory this should work on English and Russian. Any feedback is very welcome.


@MoritzLaurer was there any update on this? I would be very interested in a superior model to facebook/bart-large-mnli

yeah, this one is better than bart-large-mnli: MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli · Hugging Face (or this one: MoritzLaurer/DeBERTa-v3-base-mnli-fever-docnli-ling-2c · Hugging Face) and they should also be faster
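Both checkpoints plug straight into the zero-shot pipeline. A quick usage sketch (this downloads the model on first run, so treat it as illustrative):

```python
from transformers import pipeline

# Either of the checkpoints mentioned above works here.
classifier = pipeline("zero-shot-classification",
                      model="MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli")

result = classifier("Angela Merkel is a politician in Germany",
                    candidate_labels=["politics", "sports", "economy"])
print(result["labels"][0], result["scores"][0])
```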


Great! Thanks, I will be comparing them for zero-shot text classification.

Hi @MoritzLaurer
Is there a way to update the first post with some example notebooks on how to take any transformer-based model and one or more NLI datasets and fine-tune a new zero-shot text classifier? (Some architectures, like BERT, are missing from the Model Hub; I would like to add those.)

I used the code in the run_xnli script and it worked well! :slight_smile:
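For reference, an invocation along these lines worked for me — the arguments come from the transformers `run_xnli.py` example script, but the model name, hyperparameters, and output path below are placeholders to adjust for your own run:

```shell
python run_xnli.py \
  --model_name_or_path microsoft/deberta-v3-base \
  --language en \
  --train_language en \
  --do_train \
  --do_eval \
  --per_device_train_batch_size 32 \
  --learning_rate 2e-5 \
  --num_train_epochs 2 \
  --max_seq_length 128 \
  --output_dir ./deberta-v3-xnli
```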