Hello there! I’m using HF’s Optimum library, version 1.1.0 (the latest release), and I’ve bumped into some problems. I’m only using the ONNX part of the library.
First of all, the docs seem to list 1.0.0 as the latest stable release, if I’m not mistaken, and there are some small breaking changes between the two, e.g. v1.1.0 uses `quantizer._onnx_config` instead of the `quantizer.onnx_config` mentioned in the docs.
Second, I’m trying to save the ORTConfig to disk as shown in the docs/examples/notebooks, but when I call `ORTConfig.from_*` (from any source: JSON, dict, or disk), the library doesn’t instantiate an Optimum config; instead it loads a transformers config that doesn’t recognise any of the Optimum keys (`optimum_version`, `quantization`, `optimization`, etc.). As a result it’s impossible to load and run the quantized model, since I can’t load its config.
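For reference, this is roughly what I’m attempting (the directory name is just an example, and the exact load call is what I picked up from the notebooks, so it may not be verbatim):

```python
from optimum.onnxruntime.configuration import ORTConfig

# The quantized model and its Optimum config were saved to a local folder
# (illustrative path).
save_dir = "distilbert-quantized"

# Loading it back is where things break: instead of an ORTConfig, what comes
# back behaves like a plain transformers config, and the Optimum-specific keys
# ("optimum_version", "quantization", "optimization") are not recognised.
ort_config = ORTConfig.from_pretrained(save_dir)
```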
My personal hotfix is to dump and load `quantizer._onnx_config` with pickle, which is not a real solution for plenty of reasons.
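Sketch of that workaround, assuming `quantizer` is the quantizer instance used for the export (clearly fragile, but it unblocks loading for now):

```python
import pickle

# Dump the private config right after quantization...
with open("onnx_config.pkl", "wb") as f:
    pickle.dump(quantizer._onnx_config, f)

# ...and load it back later, in the process that runs the quantized model.
with open("onnx_config.pkl", "rb") as f:
    onnx_config = pickle.load(f)
```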
Last (though this one is more of a classic GitHub issue and I will post it there): there was no option to load a custom dataset (one not shared on the Hub) that was saved to disk, because Optimum’s quantization.py calls `load_dataset()`, which doesn’t support loading from disk. Replacing that line with a condition that falls back to `load_from_disk()` when the dataset isn’t found on the Hub fixes the issue.
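Something along these lines (the helper name is mine, just to show the idea of the patch):

```python
import os

from datasets import load_dataset, load_from_disk


def load_calibration_dataset(dataset_name_or_path, **kwargs):
    """Fall back to a dataset saved with `Dataset.save_to_disk()` when the
    identifier is a local directory rather than a Hub dataset name."""
    if os.path.isdir(dataset_name_or_path):
        return load_from_disk(dataset_name_or_path)
    return load_dataset(dataset_name_or_path, **kwargs)
```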
Has anyone encountered similar problems? Or has anyone managed to actually quantize a model with Optimum, save everything, and load it back from scratch to make predictions through HF’s stack (rather than using the exported model directly with ONNX)?