How to export an mT5 model to ONNX/TorchScript and use it?

I have fine-tuned a model, “tzq0301/mT5-news-title-generation” (on the Hub), based on “csebuetnlp/mT5_multilingual_XLSum”.

I can generate summaries with model.generate(...), but I don’t know how to do the same with model(...) (i.e., model.__call__(...)) after exporting the model to ONNX or TorchScript, since the exported model only exposes a single forward pass.

I can get input_ids and attention_mask from tokenizer(...).

And calling model(...) returns a Seq2SeqLMOutput, which contains:

  • loss=None, logits=tensor(...)
  • past_key_values=tuple(tuple(tensor(...)))
  • decoder_hidden_states=None
  • decoder_attentions=None
  • cross_attentions=None
  • encoder_last_hidden_state=tensor(...)
  • encoder_hidden_states=None
  • encoder_attentions=None
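As far as I can tell, logits is the field relevant to decoding: it has shape (batch, decoder_length, vocab_size), and the scores at the last decoder position rank candidates for the next token. A toy example with invented numbers (a vocabulary of 4 tokens and two decoder positions — none of this comes from the real model):

```python
# Toy logits for one sequence: 2 decoder positions over a 4-token
# vocabulary. All numbers are invented for illustration.
logits = [[[0.1, 0.2, 0.3, 0.4],    # scores after decoder position 0
           [0.9, 0.1, 0.5, 0.2]]]   # scores after the last position

last = logits[0][-1]                 # next-token scores for sequence 0
next_id = max(range(len(last)), key=last.__getitem__)  # argmax
print(next_id)  # -> 0
```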

I don’t know which of these outputs to use, or how to use them, to generate the summary.
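My current understanding (please correct me if I’m wrong) is that generate() is essentially a loop around the forward pass: feed the encoder input plus the decoder tokens produced so far, take the argmax of the last position’s logits, append it, and stop at EOS. Here is that loop sketched with a stand-in forward function in place of the exported model — the stand-in, the vocabulary size, and the scripted reply are all made up; only the loop structure is what I’m asking about (PAD as decoder start and EOS id 1 follow the T5 convention, as I understand it):

```python
# Greedy decoding driven only by per-step logits, the way I assume
# generate() works under the hood. An exported ONNX/TorchScript model
# would replace `forward`; everything below is a stand-in.

EOS_ID = 1   # assumed end-of-sequence id (T5 convention)
PAD_ID = 0   # T5 uses the pad token as the decoder start token

def forward(input_ids, decoder_input_ids):
    """Stand-in for the exported model's forward pass: returns the
    logits for the last decoder position. Here it just plays back a
    fixed scripted reply, one token per step."""
    scripted = [42, 7, EOS_ID]
    step = len(decoder_input_ids) - 1
    vocab_size = 100
    logits_last = [0.0] * vocab_size
    logits_last[scripted[min(step, len(scripted) - 1)]] = 1.0
    return logits_last

def greedy_generate(input_ids, max_len=20):
    decoder_input_ids = [PAD_ID]          # decoder starts from pad
    for _ in range(max_len):
        logits_last = forward(input_ids, decoder_input_ids)
        # argmax over the vocabulary = greedy next-token choice
        next_id = max(range(len(logits_last)), key=logits_last.__getitem__)
        decoder_input_ids.append(next_id)
        if next_id == EOS_ID:
            break
    return decoder_input_ids[1:]          # drop the start token

print(greedy_generate([5, 6, 7]))  # -> [42, 7, 1]
```

Is this the right way to drive an exported model, and do past_key_values matter only as a speed optimization (caching) rather than for correctness?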