New pipeline for zero-shot text classification

How’s it going?

I’m getting different entailment probabilities when I use the pipeline versus when I don’t. The code below does not use the pipeline:

from transformers import AutoModelForSequenceClassification, AutoTokenizer
nli_model = AutoModelForSequenceClassification.from_pretrained('facebook/bart-large-mnli')
tokenizer = AutoTokenizer.from_pretrained('facebook/bart-large-mnli')

premise = 'claude giroux played for the flyers'
hypothesis = 'hockey'

# run through model pre-trained on MNLI
x = tokenizer.encode(premise, hypothesis, return_tensors='pt',
                     truncation='only_first')
logits = nli_model(x)[0]

# we throw away "neutral" (dim 1) and take the probability of
# "entailment" (2) as the probability of the label being true 
entail_contradiction_logits = logits[:,[0,2]]
probs = entail_contradiction_logits.softmax(dim=1)


print('probs not using pipeline:', probs)

and my probabilities are
[[0.2292, 0.7708]]
where 0.7708 corresponds to entailment

The code below uses the pipeline:

from transformers import pipeline

premise = 'claude giroux played for the flyers'
hypothesis = 'hockey'

classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")

preds = classifier(premise, hypothesis, multi_label=False)

print('probs using pipeline:', preds)

and my entailment probability is [0.10335938632488251]

I also get the same probability as above when multi_label is set to True.

Is there a reason for the differing probabilities?
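
For reference, here is a sketch of a comparison that might isolate the difference, assuming the pipeline's default hypothesis template is "This example is {}." (so the model would see "This example is hockey." rather than the raw string "hockey"); passing hypothesis_template="{}" below simply turns that wrapping off:

from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

# score the raw label text as the hypothesis, mirroring the manual encode() call above;
# multi_label=True should give a per-label entailment-vs-contradiction score, like the
# manual softmax over logits[:, [0, 2]]
preds = classifier('claude giroux played for the flyers', ['hockey'],
                   hypothesis_template="{}", multi_label=True)
print(preds)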

This is an interesting idea, but I find it too slow. Imagine I have a very long text and 100 different labels: the text will be encoded 100 times, once per label.

And if I have 10k long sentences, it will be very, very slow.

I like the first approach in your blog, where we encode the 10k sentences and the labels only once and then assign labels using cosine similarity (something like the sketch below). That would be much faster.
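
A minimal sketch of what I mean, assuming sentence-transformers (the checkpoint and example texts are just illustrative):

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('all-MiniLM-L6-v2')

sentences = ['claude giroux played for the flyers',
             'the senate passed the bill yesterday']
labels = ['hockey', 'politics', 'science']

# every sentence and every label is encoded exactly once
sentence_embeddings = model.encode(sentences, convert_to_tensor=True)
label_embeddings = model.encode(labels, convert_to_tensor=True)

# cosine-similarity matrix of shape (num_sentences, num_labels)
similarities = util.cos_sim(sentence_embeddings, label_embeddings)
best = similarities.argmax(dim=1)
for sentence, idx in zip(sentences, best):
    print(sentence, '->', labels[int(idx)])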

What do you think?

Hi all,

I want to do zero-shot text classification for automotive parts, with around 3,200 candidate labels. The classification needs to be done on the basis of the part description.
Pretrained zero-shot models like BART-MNLI are not giving me good results, as they don't have much domain knowledge. How can I fine-tune a zero-shot model on the full corpus of descriptions? I think this would improve results a lot. I saw a similar approach in ULMFiT by fastai: they first train the language-model encoder to predict the next word over the whole corpus, then use that encoder as the backbone for a text classifier, and once that is fine-tuned, the results are better.

Thanks for any help. Please share any notebook or blog that can help me implement this…
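
Something like the following is what I have in mind for the first ULMFiT-style step (continuing language-model pretraining on the description corpus). This is only a sketch: the backbone, the example descriptions, and the hyperparameters are placeholders, not a tested recipe.

from datasets import Dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = 'roberta-base'  # placeholder encoder; swap in whatever backbone fits
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# in practice this would be the full corpus of part descriptions
descriptions = ['front brake pad set for sedan models', '12V alternator, 90A output']
dataset = Dataset.from_dict({'text': descriptions})
dataset = dataset.map(lambda batch: tokenizer(batch['text'], truncation=True, max_length=128),
                      batched=True, remove_columns=['text'])

collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
args = TrainingArguments(output_dir='domain-adapted-lm', num_train_epochs=3,
                         per_device_train_batch_size=16)
Trainer(model=model, args=args, train_dataset=dataset, data_collator=collator).train()

# the adapted checkpoint could then be fine-tuned on an NLI dataset (e.g. MNLI)
# and used with the zero-shot pipeline in place of a generic MNLI model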

I am currently doing inference with valhalla/distilbart-mnli-12-1 and 30 possible candidate labels on about 70k data points. To get the labels, I am using batching and running this:
for out in tqdm.tqdm(classifier(KeyDataset(train_dataset, "input"), candidate_labels=list_of_topics, batch_size=256)):

Running on a Colab GPU, it looks like it will take about 7 hours. Is this expected, and is there anything I can do to speed it up?
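
A sketch of two settings that sometimes help, assuming a CUDA GPU and a reasonably recent transformers version (device and torch_dtype are standard pipeline() arguments; everything else stays as above):

import torch
from transformers import pipeline

classifier = pipeline('zero-shot-classification',
                      model='valhalla/distilbart-mnli-12-1',
                      device=0,                   # keep the model on the GPU
                      torch_dtype=torch.float16)  # half precision cuts memory and compute

# note: each text is still scored against every candidate label, so total work grows
# with len(candidate_labels); trimming the label list helps more than anything else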

Hi, I saw this statement and am curious whether it is still true. In our testing, the latency of the pipeline seems to scale linearly with the number of labels, but if I recreate a simplified version where I batch each sequence pair into a single forward pass, it is substantially faster. Has this automatic label-wise batching been removed from the pipeline since you originally posted this answer?
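
A minimal sketch of what I mean by the simplified version: every (premise, hypothesis) pair for one text goes through a single forward pass. The text and labels are illustrative, and the entailment index of 2 assumes the facebook/bart-large-mnli label order.

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = 'facebook/bart-large-mnli'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name).eval()

text = 'claude giroux played for the flyers'
labels = ['hockey', 'politics', 'science']
hypotheses = [f'This example is {label}.' for label in labels]

# one batch containing every (premise, hypothesis) pair for this text
inputs = tokenizer([text] * len(labels), hypotheses,
                   return_tensors='pt', padding=True, truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (num_labels, 3)

# softmax of the entailment logits across labels, mirroring multi_label=False
scores = logits[:, 2].softmax(dim=0)
print(dict(zip(labels, scores.tolist())))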

I am wondering how best to model a scenario where I want a binary classifier whose positive class has multiple labels (e.g., this article is about sports OR politics OR science). Should I use one meta-label (“sports or politics or science”), or should I use three separate labels and sum up the probabilities?
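
To make the two options concrete (the text and labels are illustrative, and this is only meant to show the calls, not to argue for either choice):

from transformers import pipeline

classifier = pipeline('zero-shot-classification', model='facebook/bart-large-mnli')
text = 'the team clinched a playoff spot last night'

# option 1: a single meta-label scored on its own
meta = classifier(text, ['sports or politics or science'], multi_label=True)

# option 2: three separate labels; with multi_label=True each label gets an independent
# entailment-vs-contradiction score, which can then be combined (e.g. max or sum)
separate = classifier(text, ['sports', 'politics', 'science'], multi_label=True)
print(meta, separate)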