Hi, I’m wondering if it’s possible to upload a quantized model? From the “model sharing” doc, it looks like we can only upload fine-tuned models based on HF Transformers models.
I learned something from I-BERT, which is a Quantization-Aware Training model. My question is: is it possible to upload an int8 transformer model produced through Post-Training Quantization rather than Quantization-Aware Training?
The difference between the two is that Post-Training Quantization only touches the inference phase (calibrating tensor ranges and quantizing/dequantizing between int8/fp32 for a perf speedup), whereas Quantization-Aware Training emulates the quantization precision loss by inserting fake_quant ops during the training phase.
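To make the Post-Training Quantization side concrete, here is a minimal sketch using PyTorch’s dynamic quantization API (the toy `nn.Sequential` model is just an illustration, not a real transformer): no training or fake_quant ops are involved, the fp32 model is simply rewritten so its Linear layers run with int8 weights at inference time.

```python
import torch
import torch.nn as nn

# A toy fp32 model standing in for a trained transformer.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
model.eval()

# Post-Training (dynamic) Quantization: rewrite Linear layers to
# quantize weights to int8; activations are quantized on the fly.
qmodel = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# Inference works as before, just with int8 kernels under the hood.
x = torch.randn(1, 16)
out = qmodel(x)
print(out.shape)
```

The open question is then how to share `qmodel` (whose state dict contains packed int8 weights) through the Hub, since the standard `from_pretrained` flow expects fp32 checkpoints.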
Since Post-Training Quantization only involves the inference phase (a qconfig setting in PyTorch, a graph rewrite in TensorFlow), I don’t know whether it’s possible to upload such a quantized model, and if so, through which API?
Thanks for any guidance!