Optimization and Quantization of a TensorFlow Model

Hey there, I have some doubts regarding the optimization and quantization of a TF-based model. I'm optimizing and quantizing the "distilbert-base-multilingual-cased" model. For the PyTorch model I'm able to perform optimization and quantization, but for my use case I have to use the TF-based model, and there I'm unable to do it. It would be helpful if anyone could clarify the questions below.

  1. How do I perform optimization and quantization of any TensorFlow model available on the Hugging Face Model Hub?

  2. Does the Optimum library work for TensorFlow models as well? Can we use the ORTModelxxx classes for TensorFlow?

  3. Optimum[export] can convert a TensorFlow model to ONNX format with a chosen level of optimization, but it offers no quantization, so after getting the optimized ONNX model, how can I quantize it?

Hi @D3v, you can easily export and optimize your TF model with the Optimum CLI as follows:

```bash
# Export the TF checkpoint to ONNX and apply O2-level graph optimizations
optimum-cli export onnx --model distilbert-base-multilingual-cased --framework tf --optimize O2 my_onnx_model
```
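This also covers question 2: the ORTModelxxx classes load the exported ONNX graph and run it with ONNX Runtime, so it makes no difference whether the original weights were TF or PyTorch. A minimal sketch of loading the export, assuming the command above also saved the tokenizer files to `my_onnx_model` (it normally does) and used the checkpoint's default fill-mask task:

```python
from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForMaskedLM

# Load the ONNX export produced by the CLI command above; the ORTModel
# classes run the graph with ONNX Runtime, independent of the framework
# the checkpoint was exported from.
model = ORTModelForMaskedLM.from_pretrained("my_onnx_model")
tokenizer = AutoTokenizer.from_pretrained("my_onnx_model")

inputs = tokenizer("Paris is the capital of [MASK].", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # (batch_size, sequence_length, vocab_size)
```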

Then, to quantize it, you can also use the CLI (the `optimum-cli onnxruntime quantize` subcommand), as described in the Optimum documentation.
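If you prefer staying in Python, the same dynamic quantization is exposed through the ORTQuantizer class. A minimal sketch, assuming the optimized model from the export above is in `my_onnx_model` and you are deploying on an AVX512-VNNI CPU (AutoQuantizationConfig also has arm64, avx2, and avx512 presets for other hardware):

```python
from optimum.onnxruntime import ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig

# Dynamic int8 quantization needs no calibration dataset; this config
# targets CPUs with AVX512-VNNI instructions.
quantizer = ORTQuantizer.from_pretrained("my_onnx_model")
qconfig = AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False)

# Writes model_quantized.onnx (plus its config) to my_quantized_model/
quantizer.quantize(save_dir="my_quantized_model", quantization_config=qconfig)
```

You can then load the quantized folder with the same ORTModelForMaskedLM.from_pretrained call as above, passing file_name="model_quantized.onnx" if the folder contains more than one ONNX file.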