Speeding up zero shot classification [Solved]

sachin · August 12, 2020, 7:58am

Hi,

I was wondering if there was a way to speed up zero shot classification as outlined here if I was to use pytorch directly.

For example I’m guessing this default method tokenises and pads to length 512 whereas most of my text is < 50 words. I’ve had some experience in using BertWordPieceTokenizer. So I’m guessing it would also be faster to tokenize everything in one go and send it to a pytorch model directly, rather than one by one which is what I’m guessing is happening here?

Would really appreciate even a starting point if such a thing is possible.

joeddav · August 12, 2020, 5:11pm

By default it actually pads to the length of the longest sequence in the batch, so that part is efficient. The thing to keep in mind with this method is that each sequence/label pair has to be fed through the model together. So if you are running with a large # of candidate labels, that’s going to be your bottleneck. The other thing is that the default model, bart-large-mnli, is pretty big. Theoretically, the pipeline can be used with any model that has been trained on NLI, which you can specify with the model parameter when you create the pipeline. So you could try out some smaller models, but you probably won’t get anything to work as well as Bart or Roberta in terms of accuracy.
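To make that cost concrete, here is a rough sketch of how the sequence/label pairs multiply. The helper name `build_nli_pairs` is hypothetical and this is not the pipeline's actual internals; the template string is the one used later in this thread.

```python
from itertools import product


def build_nli_pairs(sequences, candidate_labels, template="This example is {}."):
    """Return one (premise, hypothesis) pair per sequence/label combination."""
    hypotheses = [template.format(label) for label in candidate_labels]
    return list(product(sequences, hypotheses))


sequences = ["Who are you voting for in 2020?", "The match went to penalties."]
candidate_labels = ["politics", "sports", "economics"]

pairs = build_nli_pairs(sequences, candidate_labels)
# Every pair is a separate row for the NLI model: 2 sequences x 3 labels = 6 rows.
print(len(pairs))
```

So the work grows with the number of sequences times the number of labels, which is why trimming the label set or using a smaller NLI model tends to matter more than padding.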

sachin · August 13, 2020, 5:50am

Thanks for the quick response @joeddav. I tried the following:

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-mnli")
model = AutoModel.from_pretrained("facebook/bart-large-mnli")

sequence = "Who are you voting for in 2020?"
template = "This example is {}."
x = tokenizer(sequence, template.format("politics"), return_tensors="pt")
y = model(**x)
print([a.shape for a in y])
# [torch.Size([1, 17, 1024]), torch.Size([1, 17, 1024])]

whereas I was expecting the output to be of size (1, 3), giving the logits/probabilities of entailment etc.

So it seems like the model I got was a headless model? Is there a way to get the model with the head?

I see your point about max length being set, but considering my classes are always the same, I probably don’t need to tokenise them every time. I could in theory tokenise the inputs and the classes separately and join them with a separator. At least this is what I was hoping to do with the above code segment.

sachin · August 13, 2020, 7:33am

whoops, figured out that I had to use AutoModelForSequenceClassification instead. All good, everything is solved.
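For later readers, a minimal sketch of what that fix might look like when scoring several sequences and labels in one padded batch. The function names are hypothetical, and the label order (contradiction=0, neutral=1, entailment=2) is an assumption worth checking against `model.config.id2label` for whichever checkpoint you load.

```python
import math


def entailment_probability(logits, contradiction_id=0, entailment_id=2):
    """Softmax over (contradiction, entailment) only, ignoring neutral.

    Assumes the label order contradiction=0, neutral=1, entailment=2;
    check model.config.id2label for the checkpoint you load.
    """
    c, e = logits[contradiction_id], logits[entailment_id]
    m = max(c, e)  # subtract the max for numerical stability
    exp_c, exp_e = math.exp(c - m), math.exp(e - m)
    return exp_e / (exp_c + exp_e)


def classify(sequences, candidate_labels, template="This example is {}.",
             model_name="facebook/bart-large-mnli"):
    """Score every sequence against every label in one padded batch."""
    # Imported lazily so the helper above works without torch/transformers.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()

    # The hypotheses are built once and reused for every sequence.
    hypotheses = [template.format(label) for label in candidate_labels]
    premises = [s for s in sequences for _ in hypotheses]
    batch = tokenizer(premises, hypotheses * len(sequences),
                      return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        logits = model(**batch).logits  # shape: (n_sequences * n_labels, 3)

    probs = [entailment_probability(row) for row in logits.tolist()]
    n = len(candidate_labels)
    return [dict(zip(candidate_labels, probs[i * n:(i + 1) * n]))
            for i in range(len(sequences))]
```

Calling `classify(["Who are you voting for in 2020?"], ["politics", "sports"])` should give one dict of label scores per sequence. Each score here is an independent entailment-vs-contradiction probability, so the scores across labels will not sum to 1. For larger inputs you would probably want to split the pairs into smaller batches and move the model and tensors to a GPU.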

joeddav · September 9, 2020, 3:30pm

Oh, I also realized I should add here for any readers that if you want to use pipelines with GPU, you can just pass device=0 where 0 is the device number, to the pipeline factory:

classifier = pipeline("zero-shot-classification", device=0)

We should be updating this to automatically use GPU soon.
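Until then, one way to avoid hard-coding the device is to pick it at runtime. This is only a sketch: the helper names are hypothetical, and it assumes the pipeline's convention that `device=-1` means CPU.

```python
def pick_pipeline_device():
    """Return 0 (first GPU) if CUDA is usable, otherwise -1 (CPU)."""
    try:
        import torch
    except ImportError:
        return -1
    return 0 if torch.cuda.is_available() else -1


def build_classifier(model_name="facebook/bart-large-mnli"):
    """Create the zero-shot pipeline on whatever device is available."""
    from transformers import pipeline  # imported lazily: loading is heavy
    return pipeline("zero-shot-classification", model=model_name,
                    device=pick_pipeline_device())
```

`build_classifier()` can then be used like the classifier above, and `model_name` is also where a smaller NLI checkpoint could be swapped in.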

valhalla · September 9, 2020, 3:35pm

Or you could try the onnx_transformers project, which lets you speed up HF pipelines using ONNX and also includes zero-shot-classification. Note that BART is not tested on ONNX yet, so it uses roberta-large-mnli instead of BART.
