Description
The zero-shot classification pipeline has become very popular on Hugging Face. It lets you classify a text into any category without fine-tuning a model for the specific classification task you are interested in.
The zero-shot pipeline is based on models trained on Natural Language Inference (NLI). This project will train a new NLI model, which can then be used in the zero-shot classification pipeline.
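To make the link between NLI and zero-shot classification concrete, here is a minimal sketch in plain Python. The text is treated as the premise, each candidate label is turned into a hypothesis, and the model's entailment scores are compared across labels. The function names, the example text, and the logits are made up for illustration; no model is loaded here, and a real NLI model would supply the entailment logits.

```python
import math


def build_nli_pairs(text, candidate_labels, hypothesis_template="This example is {}."):
    """Pair the input text (premise) with one hypothesis per candidate label."""
    return [(text, hypothesis_template.format(label)) for label in candidate_labels]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    highest = max(values)
    exps = [math.exp(v - highest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def rank_labels(candidate_labels, entailment_logits):
    """Single-label case: normalise entailment logits across all candidate labels."""
    probs = softmax(entailment_logits)
    return sorted(zip(candidate_labels, probs), key=lambda item: item[1], reverse=True)


labels = ["technology", "sports", "politics"]
pairs = build_nli_pairs("The new phone has a great camera.", labels)

# Hypothetical entailment logits, one per (premise, hypothesis) pair.
# In practice these would come from the NLI model's entailment class.
ranked = rank_labels(labels, [3.2, -1.0, -0.5])
print(pairs[0])
print(ranked[0][0])
```

A better NLI model produces more reliable entailment scores for these premise/hypothesis pairs, which is why training a stronger NLI model should directly improve zero-shot classification.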
Model(s)
Any base model can be used. Since there are already several NLI models on the model hub, I suggest training a new model based on Microsoft’s DeBERTa-v3 model. Version three was published only a few weeks ago and can outperform larger models (see an example here).
We can probably create a new SOTA NLI model with the new DeBERTa version and enough NLI data.
Datasets
Established NLI datasets include:
- MultiNLI
- SNLI
- ANLI
Other interesting NLI datasets include:
- FEVER-NLI
- DocNLI
- LingNLI
More datasets can be included!
Challenges
- NLI models can be trained either as 3-class classifiers (entailment/neutral/contradiction) or as 2-class classifiers (entailment/not_entailment). Each setup has its own advantages and disadvantages.
- There is a lot of NLI data (more than 2 million texts in the datasets linked above), which makes training computationally expensive. Optimising the training pipeline is a challenge.
- Many different datasets can be converted into NLI format. Including more datasets can be beneficial, but requires transforming each dataset manually.
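The first and third points can be illustrated with a small sketch: one helper collapses 3-class NLI labels into the 2-class scheme, and another turns a labelled classification example into NLI-style rows. The field names (`premise`, `hypothesis`, `label`), the hypothesis template, and the example data are assumptions chosen for illustration, not a fixed format used by any of the datasets above.

```python
THREE_CLASS_LABELS = ("entailment", "neutral", "contradiction")


def to_binary_label(label):
    """Collapse a 3-class NLI label into entailment / not_entailment."""
    if label not in THREE_CLASS_LABELS:
        raise ValueError(f"Unknown NLI label: {label!r}")
    return "entailment" if label == "entailment" else "not_entailment"


def classification_to_nli(text, true_label, all_labels,
                          hypothesis_template="This example is about {}."):
    """Turn one labelled classification example into one NLI row per class.

    The hypothesis for the true class is marked as entailment; the
    hypotheses for all other classes are marked as not_entailment.
    """
    rows = []
    for label in all_labels:
        rows.append({
            "premise": text,
            "hypothesis": hypothesis_template.format(label),
            "label": "entailment" if label == true_label else "not_entailment",
        })
    return rows


# Hypothetical topic-classification example.
rows = classification_to_nli(
    "The central bank raised interest rates again.",
    true_label="economy",
    all_labels=["economy", "sports", "health"],
)
for row in rows:
    print(row)
```

One possible argument for the 2-class setup is visible here: for a converted classification dataset it is often unclear whether a wrong class should count as neutral or as contradiction, whereas not_entailment covers both cases.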
Desired project outcomes
- Create a Streamlit or Gradio app on Spaces that provides an interface for zero-shot classification with a new NLI model in the backend.
Additional resources
See the links to the datasets above. Also see Joe Davison’s original blog post on the zero-shot pipeline.
Discord channel
To chat and organise with other people interested in this project, head over to our Discord and:
- Follow the instructions on the #join-course channel
- Join the #zero-shot channel
Just make sure you comment here to indicate that you’ll be contributing to this project.