New pipeline for zero-shot text classification

Thanks a lot @joeddav for the extra clarification!

1 Like

@joeddav - I am getting the error below after installing transformers:

from transformers import pipeline
classifier = pipeline("zero-shot-classification")
KeyError: "Unknown task zero-shot-classification, available tasks are ['feature-extraction', 'sentiment-analysis', 'ner', 'question-answering', 'fill-mask', 'summarization', 'translation_en_to_fr', 'translation_en_to_de', 'translation_en_to_ro', 'text-generation']"

2 Likes

@joeddav
I have two queries related to zero-shot text classification:

  1. Can we train zero-shot text classification on our own data? If yes, please share an example.
  2. What is the difference between sentiment analysis and zero-shot text classification, and which one is better to use?

Hey, can you check which version of transformers you're using? If you update using the code provided in the Colab notebook, it should work.

  1. Yes, you can. The idea is to fine-tune the model on an NLI task; you can then pass this custom model into the pipeline (see the sketch after this reply).

  2. You can also do sentiment analysis using the zero-shot text classification pipeline. But if you have sufficient data and the domain you're targeting for sentiment analysis is fairly niche, you could train a transformer (or any other model, for that matter) on the data you have.

Hope this makes sense and is helpful.
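
To illustrate point 1, here is a minimal sketch (the checkpoint path is a placeholder, not a real model) of passing a custom NLI-fine-tuned model into the zero-shot pipeline:

from transformers import pipeline

# assumption: 'path/to/your-nli-finetuned-model' is a local directory or hub id
# of a model that has been fine-tuned on an NLI task
classifier = pipeline('zero-shot-classification', model='path/to/your-nli-finetuned-model')
print(classifier('Who are you voting for in 2020?', candidate_labels=['politics', 'sports']))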

2 Likes

Thanks for the helpful answer, @rsk97. Let me just add a bit:

  1. I discuss this briefly in my blog post under Classification as Natural Language Inference -> When Some Annotated Data is Available. In short, if you have a limited amount of labeled data, you can further fine-tune the pre-trained NLI model. Pass the true label for a given sequence in the same way as you would during inference, e.g. <cls> Who are you voting for in 2020 ? <sep> This text is about politics . <sep>, and calculate the loss as if you were doing NLI with the true label set to entailment. You should also pass an equal number of sequences with a randomly selected false label, such as <cls> Who are you voting for in 2020 ? <sep> This text is about sports . <sep>. For these fictitious examples, the target label should be set to contradiction. This method will help a little if you only have a small amount of labeled data, but it really excels when you have a large amount of data for some of your labels and only a small amount (or no data) for others. (A rough code sketch is included at the end of this reply.)

  2. We also have a bunch of ready-trained sentiment classifiers in the model hub. Use one of those out of the box, or fine-tune it further on your particular dataset.

This brings up a good point: the zero-shot classification pipeline should only be used in the absence of labeled data or when fine-tuning a model is not feasible. If you have a good set of labeled data and you are able to fine-tune a model, you should fine-tune a model; you will get better performance at a lower computational cost.
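
As a rough sketch of the fine-tuning setup described in point 1 (assuming the facebook/bart-large-mnli label mapping, which should be checked against the model config rather than taken for granted):

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained('facebook/bart-large-mnli')
model = AutoModelForSequenceClassification.from_pretrained('facebook/bart-large-mnli')

# look up the NLI class ids from the config rather than hard-coding them
entailment_id = model.config.label2id['entailment']
contradiction_id = model.config.label2id['contradiction']

premise = 'Who are you voting for in 2020?'
true_hypothesis = 'This text is about politics.'   # the true label -> target is entailment
false_hypothesis = 'This text is about sports.'    # a randomly selected false label -> target is contradiction

inputs = tokenizer([premise, premise], [true_hypothesis, false_hypothesis], return_tensors='pt', padding=True)
labels = torch.tensor([entailment_id, contradiction_id])

outputs = model(**inputs, labels=labels)  # standard classification loss over the NLI classes
outputs.loss.backward()                   # then step an optimizer as in any ordinary fine-tuning loop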

5 Likes

Hey @joeddav
I am using the model to classify a bunch of tweets. Before reading about the pipeline, I was using the approach shared in your blog post. For some reason, when I try the same method through the pipeline with the same model and tokenizer (facebook/bart-large-mnli), the results come back faster, which helps a lot since I have to classify more than 100K tweets, but the resulting confidence scores are quite different. Can you help me understand what difference between the two approaches might be causing this? Thanks!

The probability distribution is different owing to the way it is calculated. As mentioned in a few posts above, as long as you keep multi_class=False, the softmax is calculated across the entailment scores for all the classes, whereas when you are not using the pipeline the softmax may be calculated differently.

The speed variations seem interesting. Could you please share some of the associated numbers and the specifications of the machine you're using?

Hope this answer made sense. @kmehra

Also, while running some experiments, I could see that the model was sensitive to case. I got two different classes as outputs when I used “Hey” vs “hey”. Is there something I'm missing, or is this expected behaviour?

This makes sense. However, I am referring to the case where I am running the pipeline with multi_class=True over the candidate labels. In this case, I notice the probability distribution is different compared to individually running the model over candidate label and tweet (sequence) pairs. Is there something I'm missing, or is this expected behavior?

Another thing I might need help with is addressing the following questions to get better results:

  1. To make the process faster, is it advisable to move to a smaller, lighter model than bart-large-mnli, or is there another way to speed up processing 100K tweets while keeping accuracy high with bart-large-mnli?

  2. It is interesting to note that the model is sensitive to case. Accordingly, for my use case, would it be better to run the model on preprocessed tweets (stopword removal, case folding, etc.) or should the raw tweets work better?

Thanks @rsk97

Yes, that makes sense. Even when multi_class=True, the probability of each class is calculated independently using just entailment vs contradiction (neutral is ignored, if I'm right).
For the questions:

  1. Well, one definite way to make it faster is to move to a smaller model. Maintaining the exact accuracy while still making it faster is tricky. A couple of options that could be explored are parallelism and quantization of the layers (a rough quantization sketch follows below).
  2. I'm assuming you'll need pre-processing, as otherwise certain symbols in conjunction with words might get tokenized differently. But you'll have to make sure the pre-processing steps take into account the nature of the tweet, because hashtags, tagging, etc. are core parts of a tweet; you wouldn't want to miss out on those.
    On the performance side, I guess you could benchmark by experimenting on a smaller dataset (pre-processed vs raw) and check it out.

Let me know if this looks correct.
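
As a rough sketch of the quantization option mentioned in point 1 (dynamic quantization of the linear layers; this is a generic PyTorch technique rather than anything specific to the zero-shot pipeline, and some accuracy loss is possible):

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

tokenizer = AutoTokenizer.from_pretrained('facebook/bart-large-mnli')
model = AutoModelForSequenceClassification.from_pretrained('facebook/bart-large-mnli')

# swap the nn.Linear weights for int8 versions to speed up CPU inference
quantized_model = torch.quantization.quantize_dynamic(model, {torch.nn.Linear}, dtype=torch.qint8)

classifier = pipeline('zero-shot-classification', model=quantized_model, tokenizer=tokenizer)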

So does that mean the probabilities from running the model individually (w/o the pipeline) do not ignore neutral? I'm trying to understand what might be better for my use case and what creates the difference between the two cases so I can account for it.
As for the questions:

  1. Will try and look into parallelism. Is there any particular approach (or resource) you suggest? Is there a parameter I could use, like n_jobs or something?

  2. Basic pre-processing, such as removing certain symbols (# and @), is being done regardless. The question is whether to run the tweets through other steps such as stopword removal, case folding, lemmatization, etc. However, the benchmarking approach looks interesting and doable. Thanks @rsk97!

@kmehra can you post code snippets for both the pipeline and your own code following the blog post? If you're using the same model and have multi_class=True, the results should be the same.

Also:

  • Case sensitivity is normal. This is just a tokenization choice on the part of the model creators. I don't think there's any reason to worry about it, but if you want the results to be the same you can just .lower() everything yourself before you send it to the model.
  • Assuming you’re using the same model, the pipeline is likely faster because it batches the inputs. If you pass a single sequence with 4 labels, you have an effective batch size of 4, and the pipeline will pass these through the model in a single pass.
  • The pipeline ignores neutral either way, and it additionally ignores contradiction when multi_class=False.
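
A rough sketch of that scoring logic (assuming the logit order contradiction, neutral, entailment used by facebook/bart-large-mnli):

import torch

def scores_from_nli_logits(logits, multi_class):
    # logits: tensor of shape (num_candidate_labels, 3), one NLI forward pass per candidate label
    if multi_class:
        # per label: softmax over entailment vs contradiction only, neutral is dropped
        entail_contr = logits[:, [0, 2]]
        return entail_contr.softmax(dim=1)[:, 1]
    # single-label case: softmax of the entailment logit across all candidate labels
    return logits[:, 2].softmax(dim=0)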
1 Like

Sure, thanks for the help @joeddav
Sharing the code snippet below running on an example tweet.

import numpy as np
from transformers import BartTokenizer, BartForSequenceClassification, pipeline

TERMS = [...]  # list of candidate labels
HYPOTHESES = ['This text is about ' + x for x in TERMS]  # the labels written out in the template form

tokenizer = BartTokenizer.from_pretrained('facebook/bart-large-mnli')
model = BartForSequenceClassification.from_pretrained('facebook/bart-large-mnli')
classifier = pipeline(task='zero-shot-classification', model=model, tokenizer=tokenizer, framework='pt')
  1. Using the model w/o the pipeline:

  2. Using the pipeline:

     def get_labels_pipeline(tweet, threshold=THRESHOLD):
         '''Method to get the labels for a tweet based on the threshold specified'''
         topics = []
         results = classifier(tweet, TERMS, multi_class=True)
         for idx, score in enumerate(results['scores']):
             score = score * 100
             if score >= threshold:
                 topics.append((results['labels'][idx], np.round(score, 2)))
         return topics

Example:

text = "West Bengal calls for Indian Army's support to restore essential infrastructure, services after Cyclone Amphan havoc CycloneAmphan Amphan AmphanUpdates"

W/o the pipeline - get_labels(text, threshold=50)
[('resource availability', 50.59), ('relief measures', 85.47), ('infrastructure', 80.32), ('rescue', 66.81), ('news updates', 93.95), ('grievance', 79.94)]

With the pipeline - get_labels_pipeline(text, threshold=50)
[('infrastructure', 98.93), ('relief measures', 95.18), ('grievance', 92.81), ('news updates', 83.83), ('power supply', 80.1), ('utilities', 76.64), ('sympathy', 75.98), ('water supply', 73.14), ('rescue', 70.47)]

Thanks for the help again!

1 Like

The only difference I see at a glance is that the hypotheses in your manual example don't have a period at the end, while the pipeline's default template does. Frustrating, but I've found that omitting that period does have an impact. Let me know if that solves it; if not, I'll take a closer look.
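
For reference, the template can also be passed explicitly so that both setups use exactly the same hypothesis string (a small sketch; classifier, tweet, and TERMS are the objects from the snippet above):

results = classifier(
    tweet,
    TERMS,
    hypothesis_template='This text is about {}.',  # trailing period included, matching the manual template
    multi_class=True,
)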

2 Likes

Ahh, that fixed it! Thanks a lot @joeddav

1 Like

@joeddav I am having the same issue that @hanman is having: no zero-shot task available. I am using transformers 3.0.2.

Version: 3.0.2
Summary: State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch
Home-page: https://github.com/huggingface/transformers
Author: Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Sam Shleifer, Patrick von Platen, Google AI Language Team Authors, Open AI team Authors, Facebook AI Authors, Carnegie Mellon University Authors
Author-email: thomas@huggingface.co
License: Apache
Location: /opt/conda/lib/python3.7/site-packages
Requires: sentencepiece, regex, filelock, numpy, packaging, tokenizers, sacremoses, tqdm, requests
Required-by: 

Thanks

2 Likes

Zero-shot hasn't made it into a release yet. You can find it on master: https://github.com/huggingface/transformers/blob/5ab21b072fa2a122da930386381d23f95de06e28/src/transformers/pipelines.py#L982

2 Likes

@joeddav I’m running into the same issue as hanman and @colinferguson. I am also using transformers 3.0.2.

Would appreciate any advice.

@nedai 3.1.0 was officially released today, so just upgrade transformers and you should be good.
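
If it helps, a quick check after upgrading (a minimal sketch; the zero-shot task is only registered in 3.1.0 and later, so older releases raise the KeyError shown earlier in the thread):

import transformers
print(transformers.__version__)  # should be 3.1.0 or later

from transformers import pipeline
classifier = pipeline('zero-shot-classification')  # no KeyError once the upgrade is in place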