Hello! What’s the correct way to save a model that’s being trained with ZeRO-3 and bf16 mixed precision via the Accelerate integration of DeepSpeed?
The closest I can find is setting the stage3_gather_16bit_weights_on_model_save flag in the config. From my understanding, setting it to true would save the weights as fp16, while leaving it as false would require me to run zero_to_fp32.py, which saves the model weights as fp32.
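For reference, this is the part of the DeepSpeed config I’m talking about, trimmed to just the keys relevant here (the values mirror my setup):

```json
{
  "bf16": { "enabled": true },
  "zero_optimization": {
    "stage": 3,
    "stage3_gather_16bit_weights_on_model_save": true
  }
}
```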
How do I preserve the bf16 weights?
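In case it’s useful, here’s roughly what my save step looks like right now. It’s a minimal sketch (the tiny model and the output path are placeholders, and in my real run the model is prepared together with the optimizer and dataloaders), assuming the script is launched through accelerate with a ZeRO-3 + bf16 DeepSpeed config:

```python
import torch
from accelerate import Accelerator

accelerator = Accelerator()

# Stand-in for the real model; normally prepared alongside optimizer/dataloaders.
model = accelerator.prepare(torch.nn.Linear(8, 8))

accelerator.wait_for_everyone()

# Under ZeRO-3 this gathers the sharded parameters onto the main process;
# as far as I can tell, this is where stage3_gather_16bit_weights_on_model_save
# comes into play.
state_dict = accelerator.get_state_dict(model)

if accelerator.is_main_process:
    accelerator.save(state_dict, "model.bin")  # placeholder output path
```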