I have a question regarding this blogpost. In the calculation of MFU, on bullet point 4, should we divide by 16 A100s' worth of peak FLOPs? This brings MFU to a mere 2.8%. Is this expected, or is my understanding off?
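To make the question concrete, here is a minimal sketch of the normalization I am asking about. It assumes MFU is the achieved model FLOPs per second divided by the aggregate peak of all devices, and uses 312 TFLOPS as the dense BF16 peak of one A100. The `achieved` throughput and the `mfu` helper are hypothetical placeholders, not values or code from the blogpost.

```python
# Dense BF16 peak of a single A100 (NVIDIA datasheet figure).
A100_PEAK_BF16_FLOPS = 312e12


def mfu(achieved_flops_per_sec: float,
        num_gpus: int,
        peak_flops_per_gpu: float = A100_PEAK_BF16_FLOPS) -> float:
    """Model FLOPs utilization: achieved throughput over aggregate peak."""
    return achieved_flops_per_sec / (num_gpus * peak_flops_per_gpu)


# Hypothetical achieved throughput for the whole job (placeholder value).
achieved = 140e12

mfu_one_gpu = mfu(achieved, num_gpus=1)       # normalized by one A100
mfu_sixteen_gpus = mfu(achieved, num_gpus=16)  # normalized by 16 A100s

print(f"MFU vs 1 A100:   {mfu_one_gpu:.1%}")
print(f"MFU vs 16 A100s: {mfu_sixteen_gpus:.1%}")
```

Whatever the real throughput is, the two results differ by exactly a factor of 16, so which denominator the blogpost intends matters a lot here.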