Language generation with torchscript model?

Has anyone figured out how to run inference (equivalent to the `.generate()` method) for seq-to-seq models on Elastic Inference?
I am trying to run inference for seq-to-seq models (such as BART and Pegasus) on Elastic Inference with EC2.
So far I have been able to follow the TorchScript example (1) and save the traced model, but I cannot figure out how to run generation with it.
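For context, here is the kind of workaround I have been considering: since `.generate()` itself cannot be traced, trace only a single forward step and run the decoding loop in plain Python around the traced module. This is a minimal sketch with a toy encoder-decoder standing in for BART/Pegasus (the model class, vocabulary size, and token IDs are placeholder assumptions, not the real models):

```python
# Sketch: greedy decoding around a traced (TorchScript) seq2seq model.
# ToySeq2Seq is a hypothetical stand-in for a BART/Pegasus-style model;
# the idea is to trace one forward step, then loop over it for generation.
import torch
import torch.nn as nn

class ToySeq2Seq(nn.Module):
    """One forward step: returns next-token logits for the decoder input."""
    def __init__(self, vocab_size=32, d_model=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.encoder = nn.GRU(d_model, d_model, batch_first=True)
        self.decoder = nn.GRU(d_model, d_model, batch_first=True)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, input_ids, decoder_input_ids):
        _, h = self.encoder(self.embed(input_ids))
        dec_out, _ = self.decoder(self.embed(decoder_input_ids), h)
        return self.lm_head(dec_out[:, -1, :])  # logits for the next token

torch.manual_seed(0)
model = ToySeq2Seq().eval()

# Trace one forward step with example inputs, as in the TorchScript guide.
example = (torch.randint(0, 32, (1, 5)), torch.randint(0, 32, (1, 1)))
traced = torch.jit.trace(model, example)

def greedy_generate(traced_model, input_ids, bos_id=0, eos_id=1, max_len=8):
    """Greedy decoding loop driven from Python, calling the traced step."""
    decoder_ids = torch.tensor([[bos_id]])
    for _ in range(max_len):
        logits = traced_model(input_ids, decoder_ids)
        next_id = logits.argmax(dim=-1, keepdim=True)
        decoder_ids = torch.cat([decoder_ids, next_id], dim=1)
        if next_id.item() == eos_id:
            break
    return decoder_ids

out = greedy_generate(traced, torch.randint(0, 32, (1, 5)))
print(out.shape)
```

The trade-off is that each step re-encodes and re-runs the full decoder prefix (no cached past key/values), and Python-side loops lose beam search and the other `.generate()` features, so this is only a starting point, not a drop-in replacement.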

(1) Export to TorchScript