Pegasus Model Weights Compression/Pruning

hmm this is a bit odd :grimacing:

how did you do the quantization? i just remembered that someone already asked about quantizing pegasus here, so maybe you can check whether you can dynamically quantize the model in a similar way to how i described it there, and then try generating some outputs with the same model in memory (i.e. don’t save and reload)
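as a rough sketch, dynamic quantization looks something like this — here i'm using a toy `nn.Sequential` as a stand-in so it runs anywhere; for your case you'd pass the actual `PegasusForConditionalGeneration.from_pretrained(...)` model instead:

```python
import torch
import torch.nn as nn

# toy stand-in model; for pegasus you'd instead do
#   model = PegasusForConditionalGeneration.from_pretrained("google/pegasus-xsum")
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))

# dynamically quantize all Linear layers to int8 weights
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# use the quantized model directly in memory -- no save/reload step
x = torch.randn(4, 16)
out = quantized(x)
print(out.shape)  # torch.Size([4, 8])
```

for pegasus the equivalent in-memory check would be calling `quantized.generate(**inputs)` right after quantizing, before any `save_pretrained` round trip.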

if that works, then my guess is that from_pretrained doesn’t support loading quantized models (i can have a look) and you might need to do the loading in native pytorch
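by "loading in native pytorch" i mean something along these lines: save the quantized `state_dict`, then rebuild the same quantized architecture and load the weights into it, instead of going through `from_pretrained`. again a toy model stands in for pegasus, and the filename is just an example:

```python
import os
import tempfile

import torch
import torch.nn as nn

def build_model():
    # toy stand-in; for pegasus, rebuild the fp32 model via from_pretrained first
    return nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))

# quantize, then save only the state dict (not the whole pickled module)
quantized = torch.quantization.quantize_dynamic(
    build_model(), {nn.Linear}, dtype=torch.qint8
)
path = os.path.join(tempfile.gettempdir(), "pegasus_quantized_example.pt")
torch.save(quantized.state_dict(), path)

# to reload: re-create the quantized skeleton, then load the saved weights
reloaded = torch.quantization.quantize_dynamic(
    build_model(), {nn.Linear}, dtype=torch.qint8
)
reloaded.load_state_dict(torch.load(path, weights_only=False))
```

the key point is that the loading side has to apply `quantize_dynamic` first so the module structure matches the quantized `state_dict` — loading int8 weights straight into an fp32 model (which is roughly what `from_pretrained` would try) won't work.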