Yes, it worked! Just 44 seconds for 2,500 rows. Thank you!
Hi @valhalla, thanks for developing onnx_transformers. I have tried it with the zero-shot-classification pipeline and benchmarked ONNX against plain PyTorch, following the benchmark_pipelines notebook. I tried several SageMaker instances with various numbers of cores and CPU types. It seems that an instance with more CPU cores gives more speed-up, but it is also more expensive, and at a certain point the price is almost the same as using a GPU.
I wonder if there are other ways to speed things up while keeping the cost minimal. I found that quantization may help, but it seems that onnx_transformers doesn't support ONNX quantization yet. Do you have plans to support it? Could you kindly point me to a reference for using ONNX quantization with the zero-shot-classification pipeline (with or without onnx_transformers)?
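For context, here is roughly the kind of thing I had in mind (just a sketch, assuming the model has already been exported to an ONNX file; the file paths are placeholders):

from onnxruntime.quantization import quantize_dynamic, QuantType

# Dynamically quantize the exported graph's weights to int8 (paths are placeholders).
quantize_dynamic(
    model_input="model.onnx",         # ONNX graph exported from the MNLI model
    model_output="model-quant.onnx",  # int8-quantized graph to load back into the pipeline
    weight_type=QuantType.QInt8,
)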
Thanks in advance!
Trying to run on a large dataset using 12 labels with no success. I've asked the question on StackOverflow:
My concern is that I keep running out of memory using 57K sentences (read from CSV and fed to the classifier as a list). I'm assuming there's a way to batch process this by perhaps using a dataset. Any recommendations?
UPDATE:
tried using the GPU on Colab: classifier = pipeline("zero-shot-classification", device=0)
and got:
RuntimeError: CUDA out of memory. Tried to allocate 812.01 GiB (GPU 0; 15.90 GiB total capacity; 6.67 GiB already allocated; 6.94 GiB free; 8.09 GiB reserved in total by PyTorch)
Notice how it goes from 6GiB to 812?
Another question:
I ran the model using pipeline() and got great results:
while using the manual approach described at Zero-Shot Learning in Modern NLP | Joe Davison Blog using
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-mnli")
model = AutoModel.from_pretrained("facebook/bart-large-mnli")
yields different (and terrible) results on the same sentence and labels:
|label: nightlife | similarity: 0.14027079939842224|
|label: arts | similarity: 0.12448159605264664|
|label: stage | similarity: 0.11398478597402573|
|label: accommodation | similarity: 0.10639678686857224|
|label: outdoors | similarity: 0.10298262536525726|
|label: chat | similarity: 0.0851324051618576|
|label: fitness | similarity: 0.0802810788154602|
|label: family | similarity: 0.07305140048265457|
|label: travel | similarity: 0.06645495444536209|
|label: food | similarity: 0.05090881139039993|
|label: sports | similarity: 0.04867491126060486|
|label: health | similarity: 0.046865712851285934|
|label: music | similarity: 0.04231047257781029|
|label: social | similarity: 0.03655364364385605|
|label: shopping | similarity: 0.03481506183743477|
|label: events | similarity: 0.034809011965990067|
|label: fashion | similarity: 0.0223409254103899|
|label: culture | similarity: 0.013726986013352871|
|label: misc | similarity: -0.01880553364753723|
Am I missing something? I'm assuming this is happening because it's not using multi-class mode.
Happy to take a look at your code if you don't mind posting the snippet.
As for your memory errors, the current pipeline implementation doesn't do any mini-batching for you, so you're trying to run the whole dataset through a large transformer in one pass, which would require an incredible amount of memory. We'll hopefully have automatic batching with the upcoming pipelines revamp, but in the meantime just pass each sequence (or a handful of sequences) to the model in a separate call rather than passing the whole dataset as one list.
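Something like this is all I mean, as a rough sketch (the labels and sentences here are placeholders for your own CSV data):

from transformers import pipeline

classifier = pipeline("zero-shot-classification")  # add device=0 if a GPU is available

candidate_labels = ["travel", "food", "sports"]    # placeholder labels
sequences = ["one day I will see the world", "I need new running shoes"]  # in practice, the list read from your CSV

batch_size = 8
results = []
for i in range(0, len(sequences), batch_size):
    batch = sequences[i:i + batch_size]
    out = classifier(batch, candidate_labels, multi_class=True)
    # The pipeline returns a dict for a single sequence and a list for several,
    # so normalize to a list before accumulating.
    results.extend(out if isinstance(out, list) else [out])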
Hey people, it seems like a lot of you are interested in speeding up zero-shot. I tried one quick experiment using no-teacher BART distillation for MNLI and achieved impressive scores, with a very small drop in metrics.
You can try out those models and see if they give similar / good-enough accuracy for zero-shot; they are faster than bart-large-mnli.
All the models are available on the hub.
Repo if you want to try it out yourself.
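Swapping one of the distilled checkpoints into the pipeline should be a drop-in change, for example (a sketch using valhalla/distilbart-mnli-12-3 as one of them; the other hub checkpoints work the same way):

from transformers import pipeline

# Load the distilled MNLI checkpoint instead of the default bart-large-mnli.
classifier = pipeline("zero-shot-classification", model="valhalla/distilbart-mnli-12-3")

print(classifier("I love hiking in the mountains",
                 candidate_labels=["outdoors", "music", "food"],
                 multi_class=True))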
Thoughts, suggestions welcome
cc @joeddav
This is fantastic!
Also excited to post for the first time on the Hugging Face forum!
@joeddav: I've tried to use the zero-shot pipeline with a dataframe and default params. While the results are accurate, I can't seem to iterate over 50 rows without Colab crashing due to lack of RAM.
Is this expected? Happy to share the code if you like
Thanks in advance!
Best,
Charly
cc: @valhalla
Hi @charly, please share code, will be happy to take a look!
Thanks @valhalla for your help, much appreciated!
I finally managed to make it faster by switching to the GPU settings
I'm now facing another issue as I'm trying to deploy that zero-shot code to Heroku or Streamlit Sharing (Streamlit's new hosting service), but the app doesn't work once deployed.
As specified here, I've added these lines in my main .py file:
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-mnli")
model = AutoModel.from_pretrained("facebook/bart-large-mnli")
Here's the error message I get:
OSError: Can't load weights for 'facebook/bart-large-mnli'. Make sure that: - 'facebook/bart-large-mnli' is a correct model identifier listed on 'https://huggingface.co/models' - or 'facebook/bart-large-mnli' is the correct path to a directory containing a file named one of pytorch_model.bin, tf_model.h5, model.ckpt.
… So I'm not sure if I'm doing this properly.
Thanks in advance
Charly
Hi @charly, are you still facing this issue?
Thanks for asking @valhalla!
Yes, I still have the issue. Let me retry this week - I'll keep you posted!
That's strange. What version of transformers are you using? I believe BART used to be a "canonical" model in the library before it was moved to the facebook org, so try just bart-large-mnli without the facebook/ prefix if you're using an older version of transformers.
Btw, you'll want to use AutoModelForSequenceClassification rather than AutoModel so that you get the NLI output layer and not just the encoder.
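If you want to stick with the manual approach from the blog post, a rough sketch of what I mean looks like this (the facebook/bart-large-mnli head's label order is contradiction / neutral / entailment; the premise and label are just examples):

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-mnli")
model = AutoModelForSequenceClassification.from_pretrained("facebook/bart-large-mnli")

premise = "one day I will see the world"    # the sequence to classify
hypothesis = "This example is travel."      # one hypothesis per candidate label

inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs)[0]             # shape (1, 3): contradiction, neutral, entailment

# Drop the neutral logit and softmax over contradiction vs. entailment;
# the entailment probability is the score for this label.
entail_contradiction_logits = logits[:, [0, 2]]
probs = entail_contradiction_logits.softmax(dim=1)
print(f"probability that the label applies: {probs[:, 1].item():.3f}")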
Thanks @joeddav,
I've finally managed to get it working, although not yet in a GPU set-up.
By the way, when I remove the facebook/ prefix from facebook/bart-large-mnli, I get the following issue:
Thanks,
Charly
Can you explain in more detail why it is better to use AutoModelForSequenceClassification rather than AutoModel to get more accurate predictions with this approach?
Hi @m2rik, I'm curious to see your implementation. I tried the following approach with a GPU in Google Colab: for 20,000 rows (just one sentence per row) and 15 classes, it took 56 minutes.
zsc = pipeline(task='zero-shot-classification', tokenizer=tokenizer, model=model, device=0)
batch_size = 128
sequences = df['idea'].to_list()
list_of_ideas = []
for i in range(0, len(sequences), batch_size):
    list_of_ideas += zsc(sequences[i:i+batch_size], candidate_labels=candidate_labels, multi_class=True)
CPU times: user 31min 20s, sys: 24min 38s, total: 55min 58s
Wall time: 55min 58s
Any help is really appreciated.
Thanks.
@nayid You may have seen this already, but I'd use the distilled valhalla/distilbart-mnli-12-3 instead of the default model if you're trying to speed things up. You should get a good boost in speed/memory and it seems to have similar accuracy.
@joeddav thank you!!
Great speed improvement, almost half the time.
CPU times: user 18min 4s, sys: 13min 37s, total: 31min 41s
Wall time: 31min 42s
No-teacher distillation is really effective!
The distillation paper is out if anyone is interested.
Hello, I want to use the pipeline on my PC with Docker, but the container gets killed. Do you have a solution, please? @joeddav
Recreating classification_api_1 ... done
Attaching to classification_api_1
api_1 | /app/venv/lib/python3.7/site-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:100.)
api_1 | return torch._C._cuda_getDeviceCount() > 0
Downloading: 100%|██████████| 908/908 [00:00<00:00, 205kB/s]
Downloading: 100%|██████████| 1.63G/1.63G [10:19<00:00, 2.63MB/s]
api_1 | Some weights of the model checkpoint at facebook/bart-large-mnli were not used when initializing BartModel: ['model.encoder.version', 'model.decoder.version']
api_1 | - This IS expected if you are initializing BartModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
api_1 | - This IS NOT expected if you are initializing BartModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Downloading: 100%|██████████| 899k/899k [00:01<00:00, 834kB/s]
Downloading: 100%|██████████| 456k/456k [00:00<00:00, 658kB/s]
Killed
Looks like you're trying to use CUDA without a GPU / proper CUDA installation?
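If you just want to force the pipeline onto the CPU, you can be explicit about it when building it (a minimal sketch; device=-1 selects the CPU, device=0 would be the first GPU):

from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli",
                      device=-1)  # run on CPU explicitly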