Does fp16 training compromise accuracy?

nbroad · May 17, 2022, 9:49pm

Mixed precision training (fp16) is only possible on certain hardware, and in some cases it causes training instability, particularly when the model was pre-trained in bfloat16.
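To make the bfloat16 point concrete, here is a rough stdlib-only sketch of one commonly cited cause: fp16 has a much narrower range than bf16 (the largest finite fp16 value is 65504), so values a bf16-pretrained model handles fine can overflow in fp16. The helper names `fits_fp16` and `round_trip_bf16` are made up for illustration, and the bf16 conversion here simply truncates a float32, which is an assumption and not how any particular framework rounds.

```python
import struct

FP16_MAX = 65504.0  # largest finite value representable in IEEE half precision


def fits_fp16(x: float) -> bool:
    """Return True if x can be packed as an IEEE fp16 value without overflow."""
    try:
        struct.pack("<e", x)  # "e" is the half-precision format code
        return True
    except OverflowError:
        return False


def round_trip_bf16(x: float) -> float:
    """Approximate bf16 by keeping only the top 16 bits of the float32 encoding.

    Assumption: plain truncation, chosen for simplicity. bf16 keeps the
    8-bit exponent of float32, so the range is preserved and only mantissa
    precision is lost.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


if __name__ == "__main__":
    for value in (1000.0, FP16_MAX, 70000.0, 1e30):
        print(value, "fits fp16:", fits_fp16(value), "| as bf16:", round_trip_bf16(value))
```

The takeaway is that bf16 trades precision for range, while fp16 does the opposite, so a large activation that is merely imprecise in bf16 can become inf in fp16.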

On older GPUs (before Volta/Turing), fp16 provides no speedup and actually requires more memory, because both the fp16 and the fp32 copies of the values are kept in memory. I’d recommend reading this: Performance and Scalability: How To Fit a Bigger Model and Train It Faster
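As a back-of-the-envelope illustration of the memory point, the sketch below counts only the bytes used by the model weights, assuming 4 bytes per parameter in fp32 and an extra 2-byte fp16 copy under mixed precision. The function name `weight_memory_bytes` is hypothetical, and gradients, optimizer states, and activations are deliberately left out.

```python
BYTES_FP32 = 4  # one fp32 value
BYTES_FP16 = 2  # one fp16 value


def weight_memory_bytes(n_params: int, mixed_precision: bool) -> int:
    """Estimate memory used by model weights alone.

    Assumption: mixed precision keeps an fp32 copy of every weight
    plus an fp16 copy, so it costs 6 bytes per parameter instead of 4.
    """
    per_param = BYTES_FP32 + BYTES_FP16 if mixed_precision else BYTES_FP32
    return n_params * per_param


if __name__ == "__main__":
    n = 110_000_000  # roughly BERT-base sized, used here only as an example
    for mixed in (False, True):
        gib = weight_memory_bytes(n, mixed) / 2**30
        print(f"mixed_precision={mixed}: {gib:.2f} GiB for weights")
```

So for the weights themselves, mixed precision costs about 1.5x what plain fp32 does. Any overall savings come from elsewhere (for example smaller activations), which is why hardware without fast fp16 support gets the extra memory cost without the speed benefit.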
