New pipeline for zero-shot text classification

Hello @joeddav, how can I train the zero-shot classification pipeline on my own dataset? I get errors in the classification of some texts, so I would like to fine-tune this pipeline on my data set. Thank you.

Thanks so much for creating this great pipeline! I've been experimenting with NLI for zero-shot classification and it's really fascinating.

Could you explain a bit more the theoretical or empirical reasons for disregarding the logit for the neutral label? I imagine you do this because otherwise the model would be oversensitive and classify too many things as neutral (that's what's happening in my experiments). At the same time I feel a bit uneasy about simply ignoring this entire label, and it sometimes leads the model to classify something as 'entailed' too easily.

I read Yin et al. 2019, whom you quote in your blog post, and noticed that they write:

We convert all datasets into binary case: "entailment" vs. "non-entailment", by changing the label "neutral" (if exist in some datasets) into "non-entailment".

So for their experiments they merge 'contradiction' and 'neutral' into the same category ('non-entailment') in the different NLI datasets even before training. Then they train their base model (BERT or whichever) on these new binary NLI datasets. This means they then only take the softmax over the two logits for entailment and non-entailment (if I understand correctly) and don't have to disregard a third label, because there is none.
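If I read that correctly, the preprocessing step would look roughly like this. This is just a sketch using the datasets library; the label mapping (0 = entailment, 1 = neutral, 2 = contradiction) is the one used by the multi_nli dataset on the Hub.

from datasets import load_dataset

# MNLI labels on the Hub: 0 = entailment, 1 = neutral, 2 = contradiction
mnli = load_dataset("multi_nli", split="train")

def binarize(example):
    # Merge "neutral" and "contradiction" into one "non-entailment" class:
    # 0 = entailment, 1 = non-entailment
    example["label"] = 0 if example["label"] == 0 else 1
    return example

binary_mnli = mnli.map(binarize)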

Iā€™m wondering if:

  1. I understood this correctly?
  2. You think that this leads to a meaningful difference in performance?
  3. There are other theoretical or empirical reasons why it's fine to simply keep and ignore the neutral label?

(Another, unrelated thought: when I switched from BART-mnli to roberta-large-snli_mnli_fever_anli_R1_R2_R3-nli I got good performance boosts in my small experiments; maybe that could also be useful for your pipeline. It's great that these SOTA models are freely available via the Hugging Face model hub, so thanks again :slight_smile: )

Is there an easy way to run the inference on multiple GPUs?

Not at the moment, but hopefully in the not-too-distant future.

Hi Joe,

Quick question! :slight_smile:

I've created a Streamlit app that leverages the zero-shot classification pipeline. The app iterates over dataframes to categorise each row.
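The per-row pattern is roughly this (a simplified sketch with placeholder column names and labels, not the actual app code):

import pandas as pd
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

df = pd.DataFrame({"text": ["The stock market rallied today.", "The team won the final."]})
candidate_labels = ["business", "sports", "politics"]

# Classify each row and keep the highest-scoring label
df["category"] = [classifier(text, candidate_labels)["labels"][0] for text in df["text"]]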

I'd like to deploy on CPU instances (rather than GPUs) to save on costs (heck, these are personal projects! :sweat_smile:)

So, it may be a rather noob question, but I was wondering if there is any way to boost speed in the CPU setting. At the moment, everything is terribly slow when I try to use the app on my local (no GPU!) machine.

Any guidance would be much appreciated! :pray:

Thanks,
Charly

Hey @charly, here's a previous thread about that. The main tricks are going to be:

  • Use one of these distilled models, which are smaller and faster but give similar results (a combined sketch of this and the truncation trick follows this list).
  • Run with the ONNX Runtime. One way you can do this is with this project created by @valhalla before he joined Hugging Face.
  • If you have long sequences you're classifying, you can try truncating to just part of the sequence. That'll give you a speedup, but you'll have to evaluate how it impacts your performance.
  • If you have a large number of candidate labels, try to come up with a heuristic or use a super lightweight classifier to identify the most likely candidates, and then feed in just those rather than all of them.
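As a rough illustration of the first and third tricks combined (the 1000-character cut-off is an arbitrary assumption you would want to tune against your own accuracy):

from transformers import pipeline

# A distilled MNLI checkpoint: smaller and faster than bart-large-mnli on CPU
classifier = pipeline("zero-shot-classification", model="valhalla/distilbart-mnli-12-1")

candidate_labels = ["business", "sports", "politics"]
long_text = "..."  # a long document from your dataframe

# Crude truncation to the first 1000 characters as a speed/accuracy trade-off
result = classifier(long_text[:1000], candidate_labels)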

Btw if it's public would you mind linking to your Streamlit app? It's always fun to see the ways that people are using it :blush:

Thanks Joe!

Quite a few things to try out, that's exciting! :raised_hands:

And yes, I'll definitely share the app here as soon as it runs smoothly enough! :sweat_smile:

On a side note, I've tried the ONNX code you suggested in Colab, and it threw the following error on from onnx_transformers import pipeline:

ModuleNotFoundError: No module named 'transformers.configuration_auto'

Then, if I try to downgrade to transformers==2.5.1 as suggested here, I get another issue:

No module named 'transformers.convert_graph_to_onnx'

Have you come across this issue before?

Thanks
Charly

Hmm not sure. Maybe @valhalla would know?

Hi @charly

Could you try with transformers v3.*? I haven't tested it with v4.*.

BTW @joeddav, have you evaluated the distilled models on zero-shot? Would love to know the metrics :slight_smile:

Hey, thanks for the comment. It's a good question. I think in the multi-class, multi-label case it does make sense to include the neutral label rather than throw it out, and you're right that that is what Yin et al. do. I ran a quick experiment on GoEmotions, which is a multi-label emotion classification corpus. When I modified the code to include the neutral label, it lowered the recall and increased the precision (as you'd expect) with a boost in the overall F1. Maybe we can add an ignore_neutral argument with the default as True for now, with a warning that it will change to False in the future.

I should also note that in the single-label case the pipeline ignores both neutral and contradiction and only does the softmax over the entailment dimension. This might cause similar unease, but so far it seems to work best empirically.
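To make the two multi-label options concrete, here is a rough sketch of the scoring step on raw NLI logits. The label order below is purely for illustration; the real indices come from the model's config.

import torch

# NLI logits for one (sequence, candidate-label) pair; the order
# [contradiction, neutral, entailment] is assumed here for illustration
# (in practice it comes from model.config.label2id)
logits = torch.tensor([0.3, 1.2, 2.1])
contradiction, neutral, entailment = 0, 1, 2

# Current multi-label behaviour: drop the neutral logit, softmax over
# contradiction vs. entailment, and take the entailment probability
p_ignore_neutral = logits[[contradiction, entailment]].softmax(dim=0)[1]

# One way to include neutral: softmax over all three logits, so neutral
# and contradiction together play the role of "non-entailment"
p_with_neutral = logits.softmax(dim=0)[entailment]

print(p_ignore_neutral.item(), p_with_neutral.item())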

I have, but only on AG's News, which is a pretty easy topic classification dataset with only 4 classes, so the results are far from conclusive. Scores are accuracy.

  • facebook/bart-large-mnli: 0.6886842105263158
  • valhalla/distilbart-mnli-12-1: 0.7248684210526316
  • valhalla/distilbart-mnli-12-3: 0.6981578947368421
  • valhalla/distilbart-mnli-12-6: 0.7277631578947369
  • valhalla/distilbart-mnli-12-9: 0.689078947368421

Kinda funny how the smallest one did several points better than the original, but again it's just one easy/small dataset so I don't think we can say much from it.
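For anyone wanting to run a similar check, here's a rough sketch of that kind of evaluation with the datasets library (not necessarily the exact script behind the numbers above; the label names come straight from the ag_news dataset):

from datasets import load_dataset
from transformers import pipeline

ag_news = load_dataset("ag_news", split="test")
label_names = ag_news.features["label"].names  # ['World', 'Sports', 'Business', 'Sci/Tech']

classifier = pipeline("zero-shot-classification",
                      model="valhalla/distilbart-mnli-12-1", device=0)

correct = 0
for example in ag_news:
    predicted = classifier(example["text"], label_names)["labels"][0]
    correct += int(predicted == label_names[example["label"]])

print("accuracy:", correct / len(ag_news))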

I'm using the model deployed behind a Flask API (just one instance of the model is shared by all requests), but when I send two requests at the same time, I receive an error, the same as in this issue on GitHub: https://github.com/huggingface/tokenizers/issues/537

RuntimeError: Already borrowed
https://github.com/huggingface/tokenizers/blob/598ce61229c789465966682687fa12a90ec58074/bindings/python/py_src/tokenizers/implementations/base_tokenizer.py#L107-L123

model = pipeline('zero-shot-classification', model='joeddav/xlm-roberta-large-xnli', device=0)
model(sequence_to_classify, candidate_labels, hypothesis_template=hypothesis_template)

According to the issue on GitHub, I need to use a different tokenizer for each request, but AFAIK the tokenizer is tied to the model, so how can I make it work?

GitHub suggestion:

I think the easiest way to fix it, for now, will be to ensure you have an instance of the tokenizer for each thread

I suspect that's something to do with the Rust backend of our fast tokenizers. Try passing use_fast=False when you call pipeline.
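Something along these lines, as a minimal sketch; the lock-based variant is just one way to follow the GitHub suggestion of not sharing a single fast tokenizer across threads:

from threading import Lock
from transformers import pipeline

# Option 1: load the slow (pure-Python) tokenizer, which is not affected by
# the Rust "Already borrowed" error, at some cost in tokenization speed
classifier = pipeline("zero-shot-classification",
                      model="joeddav/xlm-roberta-large-xnli",
                      use_fast=False, device=0)

# Option 2: keep the fast tokenizer but serialize access across Flask threads
lock = Lock()

def classify(sequence, candidate_labels, hypothesis_template):
    with lock:
        return classifier(sequence, candidate_labels,
                          hypothesis_template=hypothesis_template)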

Thanks @valhalla! It worked with transformers v3!

A new issue arose, however, when trying to run onnx_transformers on my local machine (CPU, Dell Latitude 7490, Windows 10 x64):

InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Unexpected input data type

for the following line:

File "C:\Users\Desktop\OneShotClass\ONNX_CPU.py", line 13, in <module>
    classifier = pipeline("zero-shot-classification", onnx=True)

Also FYI, here's the code:

import pandas as pd
import numpy as np
import streamlit as st
from onnx_transformers import pipeline
classifier = pipeline("zero-shot-classification", onnx=True) 

Have you guys come across that issue before?

cc @joeddav

Thanks,
Charly

Could you post the full stack trace?

Sure, here it is:

InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Unexpected input data type
Traceback:
File "c:\users\charly\desktop\onnx-cpu-zero-shot-class-test\venv\lib\site-packages\streamlit\script_runner.py", line 332, in _run_script
    exec(code, module.__dict__)
File "C:\Users\Charly\Desktop\ONNX-CPU-Zero-Shot-Class-Test\app.py", line 5, in <module>
    classifier = pipeline("zero-shot-classification", onnx=True)
File "c:\users\charly\desktop\onnx-cpu-zero-shot-class-test\venv\lib\site-packages\onnx_transformers\pipelines.py", line 1771, in pipeline
    return task_class(
File "c:\users\charly\desktop\onnx-cpu-zero-shot-class-test\venv\lib\site-packages\onnx_transformers\pipelines.py", line 925, in __init__
    super().__init__(*args, args_parser=args_parser, **kwargs)
File "c:\users\charly\desktop\onnx-cpu-zero-shot-class-test\venv\lib\site-packages\onnx_transformers\pipelines.py", line 559, in __init__
    self._warup_onnx_graph()
File "c:\users\charly\desktop\onnx-cpu-zero-shot-class-test\venv\lib\site-packages\onnx_transformers\pipelines.py", line 730, in _warup_onnx_graph
    self._forward_onnx(model_inputs)
File "c:\users\charly\desktop\onnx-cpu-zero-shot-class-test\venv\lib\site-packages\onnx_transformers\pipelines.py", line 724, in _forward_onnx
    predictions = self.onnx_model.run(None, inputs_onnx)
File "c:\users\charly\desktop\onnx-cpu-zero-shot-class-test\venv\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 124, in run
    return self._sess.run(output_names, input_feed, run_options)

Thanks,
Charly

Hi guys,
Not sure if anybody has the same problem, but I don't see any difference in speed when batching versus passing a single input. It is often the opposite: inferring one example at a time is faster...
I have 10 classes and it is taking on average around 3 seconds per prediction.
I would like to batch multiple inputs in order to reduce the latency.

This is the code:

classifier = pipeline("zero-shot-classification", model=f'valhalla/distilbart-mnli-12-3', device=0)
classifier(batch, categories, multi_class=True)

Any thoughts on this?
Thank you!

I'm not sure why you're seeing latency that poor, but one thing to keep in mind is that if you feed N sequences and K classes through the pipeline, the true batch size is not N but N × K. This is because every sequence/label pair has to be fed through the model separately. So when you increase the size of your batch from 1 to 10 but have K = 10, you're actually increasing the true batch size from 10 to 100, not from 1 to 10. Hence the relatively insignificant speedup. Batching is happening under the hood even with N = 1.
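A quick sketch of that arithmetic, just counting the sequence/label pairs that actually go through the model:

sequences = ["some input text"] * 10                   # N = 10 sequences
candidate_labels = [f"label_{i}" for i in range(10)]   # K = 10 classes

# Each (sequence, label) pair is one NLI example for the model,
# so the true batch size is N * K, not N
true_batch_size = len(sequences) * len(candidate_labels)
print(true_batch_size)  # 100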

That said, 3 seconds per prediction definitely seems too slow on GPU esp. with that model. Did you ever figure out what was causing the latency?

Can the DataParallel module be used to parallelize inference across multiple GPUs? DataParallel - PyTorch master documentation