Way to make zero-shot pipeline inference faster?

There’s some discussion in this topic that you could check out.

Here are a few things you can do:

  • Try out one of the community-uploaded distilled models on the Hub (thx @valhalla). I’ve found them to get pretty similar performance on zero-shot classification, and some of them are much smaller and faster. I’d start with valhalla/distilbart-mnli-12-3. You can specify the model when you construct the pipeline, e.g. pipeline("zero-shot-classification", model="valhalla/distilbart-mnli-12-3") (see the first sketch after this list).
  • If you’re on GPU, make sure you’re passing device=0 to the pipeline factory to utilize CUDA (also shown in the first sketch below).
  • If you’re on CPU, try running the pipeline with ONNX Runtime; you should get a nice boost. Here’s a project (thx again @valhalla) that lets you use HF pipelines with ORT automatically. A rough ONNX sketch also follows the list.
  • If you have a lot of candidate labels, try to get clever about passing just the most likely ones to the pipeline. Passing a large number of labels for each sequence will really slow you down, since every sequence/label pair has to be passed through the model together. If you have 100 possible labels but can use some kind of heuristic or simpler model to narrow them down first, that will help a lot (see the pre-filtering sketch below).
  • Use mixed precision. This is pretty easy if you’re on PyTorch 1.6 or later (sketch at the end of this post).
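
For the first two points, here’s a minimal sketch of constructing the pipeline with the distilled model and putting it on GPU. The input text and labels are just placeholders:

```python
from transformers import pipeline

# Distilled MNLI model: smaller and faster than bart-large-mnli,
# with similar zero-shot accuracy in my experience.
# device=0 puts the model on the first CUDA GPU; omit it (or use -1) for CPU.
classifier = pipeline(
    "zero-shot-classification",
    model="valhalla/distilbart-mnli-12-3",
    device=0,
)

result = classifier(
    "The new update makes the app crash on startup.",
    candidate_labels=["bug report", "feature request", "praise"],
)
print(result["labels"][0], result["scores"][0])
```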
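I won’t reproduce the linked project here, but as one illustration of the ONNX route, this sketch uses Hugging Face’s Optimum library (a separate tool from the project mentioned above) to export the model to ONNX and wrap it back into a pipeline. Treat the exact API (ORTModelForSequenceClassification, export=True) as an assumption to verify against the Optimum docs for your version:

```python
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer, pipeline

model_id = "valhalla/distilbart-mnli-12-3"

# export=True converts the PyTorch checkpoint to ONNX on the fly;
# ONNX Runtime then executes the exported graph (CPU by default).
model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

classifier = pipeline("zero-shot-classification", model=model, tokenizer=tokenizer)
print(classifier("I loved this movie!", candidate_labels=["positive", "negative"]))
```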
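For pre-filtering labels, here’s a sketch of one possible heuristic: use a cheap sentence-embedding model to rank all the candidate labels, then run only the top few through the zero-shot pipeline. The sentence-transformers model name and the top_k value are arbitrary choices for illustration, not recommendations:

```python
from sentence_transformers import SentenceTransformer, util
from transformers import pipeline

# Cheap bi-encoder to pre-rank labels: one encode per label is far
# faster than one full NLI forward pass per sequence/label pair.
embedder = SentenceTransformer("all-MiniLM-L6-v2")
classifier = pipeline(
    "zero-shot-classification",
    model="valhalla/distilbart-mnli-12-3",
)

def classify_with_prefilter(text, all_labels, top_k=10):
    # Rank every candidate label by embedding similarity to the text...
    text_emb = embedder.encode(text, convert_to_tensor=True)
    label_embs = embedder.encode(all_labels, convert_to_tensor=True)
    scores = util.cos_sim(text_emb, label_embs)[0]
    shortlist = [all_labels[i] for i in scores.topk(top_k).indices]
    # ...then pay the expensive NLI pass only for the shortlist.
    return classifier(text, candidate_labels=shortlist)
```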
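And for mixed precision, a sketch using torch.cuda.amp.autocast (introduced in PyTorch 1.6) wrapped around the pipeline call. This assumes you’re on GPU; autocast won’t help on CPU:

```python
import torch
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="valhalla/distilbart-mnli-12-3",
    device=0,
)

# autocast runs eligible ops in float16, which can substantially cut
# inference time on GPUs with tensor cores (e.g. V100/T4 and newer).
with torch.cuda.amp.autocast():
    result = classifier(
        "The battery drains way too fast.",
        candidate_labels=["hardware issue", "software issue", "billing"],
    )
```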