[RFC] Transformers Pipeline v2

Hi,

Just a quick aside on ONNX inference (cross-posted from the "Supporting ONNX optimized models" thread).

I’d be interested in a pipeline like the one @valhalla created, e.g. `nlp = pipeline("sentiment-analysis", onnx=True)`, where the ONNX files are hosted on the model hub and stored in the transformers cache.

My use case is fast inference with pre-trained models in embedded applications (no network connection).

Cheers,

Alex