You can export the model to ONNX and apply quantization (e.g. dynamic INT8) to speed up CPU inference.
This thread can help: Fast CPU Inference On Pegasus-Large Finetuned Model -- Currently Impossible? - #4 by the-pale-king