Optimum warnings while quantizing

Hey there,

I am currently trying to quantize some Mistral-based models. For this, I first applied O2 or O3 graph optimization and afterwards ran dynamic quantization. While doing so, I stumbled upon the warning

`Failed to infer data type of tensor`

quite frequently, on every single layer as far as I can tell. This did not concern me too much at first, since I am used to warnings during conversion, but the quantized models then raise an error as soon as I try to load them:

`google.protobuf.message.DecodeError: Error parsing message`
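
For context, this is roughly the pipeline I used; the base model ID is a placeholder and the exact quantization config is from memory, so details may differ:

```python
from optimum.onnxruntime import ORTModelForCausalLM, ORTOptimizer, ORTQuantizer
from optimum.onnxruntime.configuration import AutoOptimizationConfig, AutoQuantizationConfig

# Export the Mistral-based model to ONNX (placeholder model ID)
model = ORTModelForCausalLM.from_pretrained("<base-model-id>", export=True)

# Apply O3 graph optimization (same outcome with O2)
optimizer = ORTOptimizer.from_pretrained(model)
optimizer.optimize(
    save_dir="em_german_leo_mistral_onnx_O3",
    optimization_config=AutoOptimizationConfig.O3(),
)

# Dynamically quantize the optimized model; this is the step that prints
# "Failed to infer data type of tensor" for every layer.
# (If the folder contains several .onnx files, ORTQuantizer also needs file_name=...)
quantizer = ORTQuantizer.from_pretrained("em_german_leo_mistral_onnx_O3")
quantizer.quantize(
    save_dir="em_german_leo_mistral_onnx_O3_quantized",
    quantization_config=AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False),
)
```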

I went down the stack trace and could not find a better explanation than the quantized model file being corrupted. Sadly, a quick Google and forum search did not turn up anyone with a similar issue.
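
For completeness, loading is nothing special; something as minimal as this (path again a placeholder) is enough to trigger the error:

```python
from optimum.onnxruntime import ORTModelForCausalLM

# Raises google.protobuf.message.DecodeError: Error parsing message
model = ORTModelForCausalLM.from_pretrained("em_german_leo_mistral_onnx_O3_quantized")
```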

I would be very grateful for any ideas about what might have caused this.
You can find the O3 version I then quantized at LuckiestOne/em_german_leo_mistral_onnx_O3.
I can also provide the quantized versions, though I doubt they will be of much use.

Thanks a lot!

P.S.: I renamed model.onnx_data to model.onnx.data, so you may need to revert this renaming when trying to reproduce.
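Reverting it is just a one-liner, assuming the file sits next to model.onnx:

```python
import os

# Restore the original external-data file name expected by the exported model
os.rename("model.onnx.data", "model.onnx_data")
```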