Hi @yanagar25, when you say you cannot run the quantized version, what kind of error are you running into?
Here’s a notebook that explains how to export a pretrained model to the ONNX format: transformers/04-onnx-export.ipynb at master · huggingface/transformers · GitHub
You can also find more details here: Exporting transformers models — transformers 4.2.0 documentation
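To give you a starting point, here's a minimal sketch of the export step using the `convert_graph_to_onnx` helper that ships with `transformers` around v4.2 (the checkpoint name, output path, and opset below are placeholders, so adjust them for your model):

```python
from pathlib import Path
from transformers.convert_graph_to_onnx import convert

# Export a pretrained checkpoint to ONNX. The output must be a Path,
# and the parent directory should be empty or non-existent.
convert(
    framework="pt",                          # "pt" for PyTorch, "tf" for TensorFlow
    model="bert-base-cased",                 # any Hub checkpoint or local path
    output=Path("onnx/bert-base-cased.onnx"),
    opset=11,
)
```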
I don’t see an obvious reason why the `generate` method should not work after quantization, so as with most things in deep learning, the best advice is to just try it and see if it does 🙂
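For instance, a quick sanity check with PyTorch's dynamic quantization might look like this (a sketch only; I'm assuming a seq2seq checkpoint here, `t5-small` is just a placeholder for whatever model you're actually using):

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "t5-small"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Dynamic quantization converts the nn.Linear weights to int8;
# activations are quantized on the fly at inference time.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

# If generate() runs here, quantization didn't break it.
inputs = tokenizer("translate English to German: Hello world", return_tensors="pt")
outputs = quantized.generate(**inputs, max_length=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

If that fails for you, posting the full stack trace here would help narrow down what's going wrong.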