Zero-shot classification and ONNX

I want to speed up inference for zero-shot classification. I am planning to use the facebook/bart-large-mnli model and to use ONNX to speed it up. The official docs only contain examples for BERT-style models, and onnx_transformers is no longer up to date.

When I tried to export the model according to this article, I used the following command:

python -m transformers.onnx --model=facebook/bart-large-mnli --features=sequence-classification --atol=1e-4 onnx_zer_shot/

Now the problem is in giving the inputs. When I manually run the HF model (without using a pipeline), I use the following input:

x = tokenizer.encode(premise, hypothesis, return_tensors='pt', truncation="only_first")

But this does not work for the ONNX session run: it expects a dictionary with 'attention_mask' and 'input_ids' keys. The HF model takes the same inputs, yet it also works when given just the encoded tensor.
NOTE: nli_model is nli_model = AutoModelForSequenceClassification.from_pretrained("facebook/bart-large-mnli")

Any help is appreciated.

@lysandre @joeddav pls help

I have the same question.