I see, thanks. I think what I need are inference optimizations such as ONNX Runtime and quantization.
The only problem I have is that the HF ONNX converter can’t convert multi-label sequence classification models yet, AFAIK. Is it planned for a future release?
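
To be concrete about what I mean by multi-label: as far as I understand, the only difference from single-label classification at inference time is that each logit gets its own sigmoid instead of a softmax over all of them, so an exported graph that returns raw logits would already be enough for me. A rough sketch in plain Python, assuming the model outputs one logit per label; `multi_label_predict`, the label names and the logit values are made up for illustration, and no ONNX or Hugging Face API is involved:

```python
import math


def sigmoid(x: float) -> float:
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def multi_label_predict(logits, labels, threshold=0.5):
    """Return every (label, probability) whose sigmoid probability reaches the threshold.

    Hypothetical helper: each label is scored independently, so zero,
    one, or several labels can be selected for the same input.
    """
    probs = [sigmoid(v) for v in logits]
    return [(label, p) for label, p in zip(labels, probs) if p >= threshold]


# Made-up logits for a single input with three candidate labels.
logits = [2.0, -1.0, 0.5]
labels = ["sports", "politics", "tech"]

picked = multi_label_predict(logits, labels)
print([label for label, _ in picked])  # prints ['sports', 'tech']
```

The 0.5 threshold is just a default I picked here; in practice it would probably need tuning per label.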