Understanding zero-shot classification in one-shot ;-)

olaffson · July 30, 2021, 3:39pm

Hello there!

I LOVE the zero-shot classification API. When you think about it, it could also be used as a search engine (to find the documents that are related to a particular topic).

My question is: conceptually, how does it work? I saw the original post "New pipeline for zero-shot text classification", but I cannot find anything that explains the model under the hood.

My understanding is that

the API gets the embedding for the input sequence
the API computes the embedding for the possible labels
a similarity measure is computed between the input sequence and each of the labels, and a softmax then turns those scores into a probability distribution over the topics
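
In code, that guess would look roughly like the sketch below. The vectors are made-up toy values standing in for real embeddings, and the function names are only illustrative, not taken from any library.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify_by_similarity(text_vec, label_vecs):
    """Steps 1-3: compare the text embedding to each label embedding, then softmax."""
    labels = list(label_vecs)
    sims = [cosine(text_vec, label_vecs[label]) for label in labels]
    return dict(zip(labels, softmax(sims)))

# Toy "embeddings" (assumed values, purely for illustration).
text_vec = [0.9, 0.1, 0.0]
label_vecs = {
    "sports": [1.0, 0.0, 0.0],
    "politics": [0.0, 1.0, 0.0],
    "cooking": [0.0, 0.0, 1.0],
}
probs = classify_by_similarity(text_vec, label_vecs)
print(max(probs, key=probs.get))
```

Because cosine similarities all lie in [-1, 1], the softmax output tends to be fairly flat unless the scores are scaled first.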

Is that correct?
Thanks and keep up the good work!

lewtun · July 30, 2021, 10:41pm

hey @olaffson, conceptually i believe the zero-shot classification pipeline works by adapting a task like natural language inference, where the language model is provided with a “template” like

"<some text you want to classify>. This text is about <MASK>"

and the model fills in the most probable label given the context. joe discusses this in this section of his blog post and you can find one of the underlying zero-shot models here
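
Here is a minimal sketch of the NLI framing, as far as I understand it: each candidate label is slotted into the template to form a hypothesis, a model scores how strongly the text entails that hypothesis, and a softmax over those scores gives the label probabilities. Note this scores each label separately rather than literally filling in a mask token. `entail_score` is a hypothetical placeholder for a real NLI model, and `toy_entail_score` is just a word-overlap stand-in so the example runs on its own.

```python
import math

def build_hypotheses(labels, template="This text is about {}."):
    """Slot each candidate label into the template to get one hypothesis per label."""
    return [template.format(label) for label in labels]

def zero_shot_nli(text, labels, entail_score, template="This text is about {}."):
    """Score (text, hypothesis) pairs and softmax the scores across labels.

    entail_score is an assumed callable (premise, hypothesis) -> float;
    in practice it would be the entailment logit of an NLI model.
    """
    hypotheses = build_hypotheses(labels, template)
    scores = [entail_score(text, h) for h in hypotheses]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return {label: e / total for label, e in zip(labels, exps)}

def toy_entail_score(premise, hypothesis):
    """Hypothetical stand-in for an NLI model: counts words the two texts share."""
    p = set(premise.lower().replace(".", "").split())
    h = set(hypothesis.lower().replace(".", "").split())
    return float(len(p & h))

result = zero_shot_nli(
    "The team won the football match",
    ["football", "cooking"],
    toy_entail_score,
)
print(result)
```

Swapping `toy_entail_score` for a function that calls a real NLI model is the only change needed to make this meaningful; the template and softmax logic stay the same.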

olaffson · July 31, 2021, 4:37pm

thanks @lewtun, I will have a look at the sources

joeddav · August 2, 2021, 5:39pm

Glad you liked it! I wrote a blog post about it last year; check out this section for a conceptual explanation.
