Quantize a Model before loading it for pre-training?

Is it possible to load a model in quantized form in order to pre-train it on a custom dataset?
I am currently training the GPT-2 model, but it's taking way too long. I am running it in fp16, and I am wondering if I can run it at an even lower precision without losing any performance.
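
For reference, here is roughly my current setup, as a minimal sketch (assuming the transformers `Trainer` API; `my_tokenized_dataset` is a placeholder for my actual dataset):

```python
from transformers import GPT2Config, GPT2LMHeadModel, Trainer, TrainingArguments

# Fresh GPT-2 initialized from the default config (pre-training from scratch)
config = GPT2Config()
model = GPT2LMHeadModel(config)

args = TrainingArguments(
    output_dir="gpt2-pretrain",
    fp16=True,                      # my current mixed-precision setting
    per_device_train_batch_size=8,  # placeholder batch size
)

trainer = Trainer(model=model, args=args, train_dataset=my_tokenized_dataset)
trainer.train()
```

And this is the kind of thing I was imagining for quantized loading, sketched with the bitsandbytes 8-bit support in transformers (I don't know whether gradients/training actually work on weights loaded this way, which is basically my question):

```python
from transformers import BitsAndBytesConfig, GPT2LMHeadModel

# Load the pretrained weights already quantized to 8-bit
# (requires the bitsandbytes package and a CUDA GPU)
quant_config = BitsAndBytesConfig(load_in_8bit=True)
model = GPT2LMHeadModel.from_pretrained(
    "gpt2",
    quantization_config=quant_config,
    device_map="auto",
)
```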