Breaking changes and issues with Optimum v1.1.0

Hello there! I'm using HF's Optimum library, version 1.1.0, as it's the latest release, and I've bumped into some problems. I'm only using the ONNX part of the library.

First of all, the docs seem to list 1.0.0 as the last stable release, if I'm not mistaken, and there are some minimal breaking changes, e.g. v1.1.0's "quantizer._onnx_config" instead of "quantizer.onnx_config", which is what the docs mention.

Second, I'm trying to save the ORTConfig to disk as shown in the docs/examples/notebooks, but when I call ORTConfig.from_* (from any source: JSON, dict, or disk), the library doesn't instantiate an Optimum config. Instead it loads a transformers config that doesn't recognise any of the Optimum keys ("optimum_version", "quantization", "optimization", etc.), so it's impossible to load and run the quantized model because I can't load its config.
My personal hotfix is to dump and load quantizer._onnx_config with pickle (sketched below), which is not a real solution for plenty of reasons.
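
To make it concrete, here is roughly the flow I have. The paths, model name, and keyword arguments are placeholders, and the ORTQuantizer instantiation just follows the v1.1.0 examples as I understand them, so treat this as an illustration rather than exact code:

```python
import pickle

from optimum.onnxruntime import ORTQuantizer
from optimum.onnxruntime.configuration import ORTConfig

# What I would like to work: save the ORT config next to the quantized model...
ort_config = ORTConfig()  # in my real code this holds the quantization/optimization settings
ort_config.save_pretrained("quantized_model/")

# ...and reload it later. Instead, what comes back behaves like a plain transformers config
# that doesn't recognise the Optimum keys ("optimum_version", "quantization", "optimization").
reloaded_config = ORTConfig.from_pretrained("quantized_model/")

# My hotfix instead: pickle the quantizer's internal ONNX config after quantizing...
quantizer = ORTQuantizer.from_pretrained("distilbert-base-uncased", feature="sequence-classification")
# ... (export/quantization calls omitted) ...
with open("onnx_config.pkl", "wb") as f:
    pickle.dump(quantizer._onnx_config, f)

# ...and unpickle it whenever the quantized model needs to be loaded again.
with open("onnx_config.pkl", "rb") as f:
    onnx_config = pickle.load(f)
```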

Last (though this one is more of a classic GitHub issue and I will post it there), there is no option to load a custom dataset that was saved to disk rather than shared on the Hub: Optimum's quantization.py calls load_dataset(), which cannot load from disk. Replacing that line with a condition that falls back to load_from_disk when the dataset is not found on the Hub fixes the issue; a sketch of that fallback is below.
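
For reference, the patch I applied is essentially the fallback below, shown here as a standalone helper rather than the in-place change to quantization.py (the exception handling is simplified):

```python
from datasets import load_dataset, load_from_disk


def load_calibration_dataset(dataset_name_or_path, **load_dataset_kwargs):
    """Load a dataset from the Hub if possible, otherwise fall back to a local
    dataset that was previously saved with `Dataset.save_to_disk()`."""
    try:
        return load_dataset(dataset_name_or_path, **load_dataset_kwargs)
    except FileNotFoundError:
        # Not found on the Hub (or not a dataset script): treat it as a local path.
        return load_from_disk(dataset_name_or_path)
```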

Has anyone encountered similar problems? Or has anyone managed to actually quantize with Optimum, save everything, and load it from scratch in order to make predictions using HF's stack (not directly with ONNX using just the exported model)?

Hi @aneof

Thanks for reporting the issues you have been encountering.

The issue with instantiating a pretrained ORTConfig has been fixed by PR#156.

Also, the ORTConfig handles all of the ONNX Runtime parameters (such as the optimization and quantization parameters).
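
As a rough illustration (the exact class names and fields can differ between versions, so check the optimum.onnxruntime.configuration module for your release):

```python
from optimum.onnxruntime.configuration import (
    AutoQuantizationConfig,
    OptimizationConfig,
    ORTConfig,
)

# Build the ONNX Runtime specific settings...
quantization_config = AutoQuantizationConfig.avx2(is_static=False, per_channel=False)
optimization_config = OptimizationConfig(optimization_level=99)

# ...and group them in a single ORTConfig, which can be saved alongside the model
# and reloaded when needed.
ort_config = ORTConfig(optimization=optimization_config, quantization=quantization_config)
ort_config.save_pretrained("quantized_model/")
```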

The configuration you are mentioning with quantizer._onnx_config is actually the ONNX configuration, which describes the metadata on how to export the model to the ONNX format (see here for more details).

When doing inference, we shouldn't need the latter, and this will soon be solved by PR#113.

Finally, concerning the calibration dataset, you don't need to use the get_calibration_dataset method: you can load your dataset yourself and then pass it as an argument when instantiating your AutoCalibrationConfig, as done in the examples.
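
For instance, assuming a dataset saved locally with save_to_disk and the min-max calibration method (the path and split are placeholders):

```python
from datasets import load_from_disk

from optimum.onnxruntime.configuration import AutoCalibrationConfig

# Load the locally saved dataset yourself instead of calling get_calibration_dataset
calibration_dataset = load_from_disk("path/to/my_dataset")["train"]

# Pass it directly when creating the calibration configuration
calibration_config = AutoCalibrationConfig.minmax(calibration_dataset)
```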