Support for exporting generate function to ONNX?

I was wondering if Hugging Face provides any support for exporting the `generate` function from the Transformers library to ONNX?

Mainly, I was trying to create an ONNX model from a GPT-2-style transformer in order to speed up inference when generating replies to a conversation.

I see there’s some support for exporting a single forward call to GPT-2, but not the entire for loop used in greedy decoding, beam search, nucleus sampling, etc.
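A common workaround is to export only the single-step forward pass and keep the decoding loop in plain Python, calling the exported model once per token. Here is a minimal greedy-decoding sketch of that idea; `next_token_logits` and `toy_model` are hypothetical stand-ins for the exported model (e.g. an ONNX Runtime session), not Transformers APIs:

```python
from typing import Callable, List

def greedy_decode(
    next_token_logits: Callable[[List[int]], List[float]],
    prompt: List[int],
    eos_id: int,
    max_new_tokens: int = 20,
) -> List[int]:
    """Greedy decoding loop around a single-step model call.

    `next_token_logits` stands in for one forward pass of the exported
    model: it maps the token ids seen so far to logits for the next token.
    """
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        logits = next_token_logits(tokens)
        # Pick the highest-scoring token (argmax) -- this control flow is
        # the part of `generate` that is hard to capture in a static graph.
        next_id = max(range(len(logits)), key=logits.__getitem__)
        tokens.append(next_id)
        if next_id == eos_id:
            break
    return tokens

# Toy stand-in model over a 5-token vocabulary: always prefers
# token (last_token + 1) mod 5.
def toy_model(tokens: List[int]) -> List[float]:
    target = (tokens[-1] + 1) % 5
    return [1.0 if i == target else 0.0 for i in range(5)]

print(greedy_decode(toy_model, prompt=[0], eos_id=4, max_new_tokens=10))
# -> [0, 1, 2, 3, 4]
```

The Python-side loop adds some per-token overhead, but the heavy per-step computation still runs in the exported graph.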

Take a look at Hugging Face Optimum, though not all models are supported. See the documentation.

Thanks for the response @guillermogabrielli. I’ve already taken a cursory look at Optimum; however, all the export scenarios seem pretty targeted. I could not find a way to export an LM plus a generalized decoder.

Hi @nifarn , could you elaborate on what you would like to see that is not available in Optimum for encoder-decoder models?