How to make `pipeline` automatically scale?

I see, thanks. I think what I actually need is inference optimization, such as ONNX Runtime or quantization.

The only problem I have is that the HF ONNX converter can’t convert multi-label sequence classification models yet, AFAIK. Is it planned for a future release?
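
To be clear about what I mean by multi-label: each logit gets its own sigmoid and is thresholded independently, instead of a softmax picking a single class. Here is a rough sketch in plain Python, not tied to any library. The label names, logits, and the 0.5 threshold are made up for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw logit to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def multi_label_predict(logits, labels, threshold=0.5):
    """Return every label whose sigmoid score reaches the threshold.

    Unlike softmax + argmax, zero, one, or several labels can be returned.
    The threshold value is a hypothetical default, not a library setting.
    """
    return [
        label
        for label, logit in zip(labels, logits)
        if sigmoid(logit) >= threshold
    ]


# Hypothetical labels and logits, standing in for one model output row.
labels = ["billing", "shipping", "returns"]
logits = [2.0, -1.0, 0.3]

predicted = multi_label_predict(logits, labels)
print(predicted)
```

So whatever export path I use would need to keep the raw logits available, so that this per-label step can still be applied afterwards.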