Thanks for the helpful answer, @rsk97. Let me just add a bit:
- I discuss this briefly in my blog post under "Classification as Natural Language Inference -> When Some Annotated Data is Available". In short, if you have a limited amount of labeled data, you can further fine-tune the pre-trained NLI model. Pass the true label for a given sequence in the same format you would use during inference, e.g. `<cls> Who are you voting for in 2020 ? <sep> This text is about politics . <sep>`, and calculate the loss as if you were doing NLI, with the target label set to `entailment`. You should also pass an equal number of sequences with a randomly selected false label, such as `<cls> Who are you voting for in 2020 ? <sep> This text is about sports . <sep>`. For these fictitious examples, the target label should be set to `contradiction`. This method will help somewhat if you only have a small amount of labeled data, but it really excels when you have a large amount of data for some of your labels and only a little (or none) for others. See the sketch after this list.
- We also have a number of ready-trained sentiment classifiers in the model hub. You can use one of those out of the box, or fine-tune one further on your particular dataset.
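For concreteness, here's a rough sketch of what that fine-tuning data setup might look like with `transformers`. The label set is made up for illustration, and the `entailment`/`contradiction` label ids assume the `facebook/bart-large-mnli` checkpoint, so check `model.config.label2id` for whichever model you use:

```python
import random
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "facebook/bart-large-mnli"  # any NLI-pretrained checkpoint works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# For this checkpoint: 0 = contradiction, 1 = neutral, 2 = entailment.
# Verify with model.config.label2id for your own checkpoint.
ENTAILMENT, CONTRADICTION = 2, 0

labels = ["politics", "sports", "economics"]  # hypothetical label set

def make_pairs(text, true_label):
    """Build one entailment pair (true label) and one contradiction pair
    (randomly selected false label) for a single labeled example."""
    false_label = random.choice([l for l in labels if l != true_label])
    premises = [text, text]
    hypotheses = [f"This text is about {true_label}.",
                  f"This text is about {false_label}."]
    targets = [ENTAILMENT, CONTRADICTION]
    return premises, hypotheses, targets

premises, hypotheses, targets = make_pairs(
    "Who are you voting for in 2020?", "politics"
)

# The tokenizer inserts the model's own special tokens around each pair,
# so we just pass (premise, hypothesis) rather than writing <cls>/<sep> by hand.
batch = tokenizer(premises, hypotheses, return_tensors="pt", padding=True)
outputs = model(**batch, labels=torch.tensor(targets))
outputs.loss.backward()  # plug into your usual optimizer / training loop
```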
This brings up a good point: the zero-shot classification pipeline should only be used in the absence of labeled data or when fine-tuning a model is not feasible. If you have a good set of labeled data and are able to fine-tune a model, you should do so; you will get better performance at a lower computational cost.
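For reference, using the zero-shot pipeline when you do lack labeled data is just a few lines (the candidate labels below are arbitrary examples):

```python
from transformers import pipeline

# No task-specific training data required: the NLI model scores each
# candidate label as a hypothesis against the input text.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
result = classifier(
    "Who are you voting for in 2020?",
    candidate_labels=["politics", "sports", "economics"],
)
print(result["labels"][0], result["scores"][0])  # highest-scoring label
```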