Understanding zero-shot classification in one-shot ;-)

Hello there!

I LOVE the zero-shot classification API. When you think about it, it could also be used as a search engine (to find the documents that are related to a particular topic).

My question is: conceptually, how does it work? I saw the original post "New pipeline for zero-shot text classification", but I cannot find anything that explains the model under the hood.

My understanding is that:

  1. the API computes the embedding for the input sequence
  2. the API computes the embedding for each of the possible labels
  3. a similarity measure is computed between the input sequence and each of the labels, and a softmax over those similarities then gives a probability for each topic
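
To make my guess concrete, here is a minimal sketch of the three steps above. It is only an illustration of my understanding, not how the API is actually implemented: the vectors are made-up toy embeddings, and `cosine`, `softmax` and `classify_by_similarity` are hypothetical helper names I chose for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_by_similarity(text_vec, label_vecs):
    """Steps 1-3: compare the text embedding to each label embedding."""
    labels = list(label_vecs)
    sims = [cosine(text_vec, label_vecs[label]) for label in labels]
    return dict(zip(labels, softmax(sims)))


# Toy embeddings (assumed values; a real model would produce these).
text_vec = [0.9, 0.1, 0.2]
label_vecs = {
    "sports": [1.0, 0.0, 0.1],
    "politics": [0.0, 1.0, 0.3],
}

probs = classify_by_similarity(text_vec, label_vecs)
print(probs)
```

With these toy vectors the "sports" label gets the larger probability, because its vector points in nearly the same direction as the text vector.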

Is that correct?
Thanks and keep up the good work!

hey @olaffson, conceptually I believe the zero-shot classification pipeline works by adapting a task like natural language inference, where the language model is provided with a “template” like

"<some text you want to classify>. This text is about <MASK>" 

and the model fills in the most probable label given the context. Joe discusses this in this section of his blog post, and you can find one of the underlying zero-shot models here
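
As a rough sketch of how the NLI adaptation could look: the text is treated as the premise, each candidate label is slotted into the template to form a hypothesis, and the labels are ranked by how strongly the premise entails each hypothesis. Everything below is an assumption for illustration: `toy_entailment_score` is a hypothetical word-overlap stand-in for a real NLI model, and `build_hypotheses` and `rank_labels` are names made up for this example, not part of the pipeline.

```python
import math


def build_hypotheses(labels, template="This text is about {}."):
    """Fill the template once per candidate label."""
    return [template.format(label) for label in labels]


def toy_entailment_score(premise, hypothesis):
    """Hypothetical stand-in for an NLI model's entailment logit.

    It just counts the words shared by premise and hypothesis;
    a real model would return a learned entailment score instead.
    """
    premise_words = {w.strip(".,").lower() for w in premise.split()}
    hypothesis_words = {w.strip(".,").lower() for w in hypothesis.split()}
    return float(len(premise_words & hypothesis_words))


def rank_labels(text, labels, entailment_score):
    """Score every (text, hypothesis) pair and softmax over the labels."""
    hypotheses = build_hypotheses(labels)
    logits = [entailment_score(text, h) for h in hypotheses]
    highest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Most probable label first.
    return sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)


text = "The football match ended with a late goal."
ranked = rank_labels(text, ["football", "cooking"], toy_entailment_score)
print(ranked)
```

Swapping `toy_entailment_score` for a function that calls an actual NLI model would keep the rest of the sketch unchanged, which is the appeal of framing classification this way: no label-specific training is needed.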

thanks @lewtun, I will have a look at the sources

Glad you liked it :) I wrote a blog post about it last year; check out this section for a conceptual explanation.