What is the recommended way to do inference with low precision during training?

It seems that converting the model's dtype changes the training loss.
For instance, the training losses of a) and b) are inconsistent:
a):

train(model)
model.half()   # casts the weights to fp16 in place
eval(model)
model.float()  # casts back to fp32; the fp16 round trip is lossy

b):

train(model)
eval(model)

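For what it's worth, here is a minimal sketch (assuming a toy nn.Linear model; this is my guess at what is going on) of why a) diverges: the fp32 -> fp16 -> fp32 round trip is lossy, so training in a) resumes from slightly perturbed weights.

import torch
import torch.nn as nn

model = nn.Linear(10, 10)
original = model.weight.detach().clone()
model.half()   # fp32 -> fp16 truncates mantissa bits in place
model.float()  # fp16 -> fp32 cannot restore the lost bits
print(torch.equal(original, model.weight))    # False: the weights drifted
print((original - model.weight).abs().max())  # small but nonzero difference
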
I have to use deepcopy to solve the issue:

import copy

train(model)
model_copy = copy.deepcopy(model).half()
eval(model_copy)
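
A quick check, under the same toy-model assumption as above, that the copy-based approach leaves the training weights untouched:

import copy
import torch
import torch.nn as nn

model = nn.Linear(10, 10)
before = model.weight.detach().clone()
model_copy = copy.deepcopy(model).half()  # only the copy is cast to fp16
print(torch.equal(before, model.weight))  # True: the fp32 original is unchanged
print(model_copy.weight.dtype)            # torch.float16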

Is there a better way to evaluate the model in fp16 during training without hard-copying the model?

Hello @hu22nlp,

Are you using mixed precision? If so, inference already happens with fp16/bf16 by default and no changes are required; only the final loss is converted to float32 for stability.
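
For reference, a minimal sketch of such a mixed-precision evaluation loop with torch.autocast (the evaluate function, eval_loader, and the cross-entropy loss here are illustrative assumptions, not from your code):

import torch
import torch.nn.functional as F

@torch.no_grad()
def evaluate(model, eval_loader, device="cuda"):
    model.eval()
    total_loss = 0.0
    for inputs, targets in eval_loader:
        inputs, targets = inputs.to(device), targets.to(device)
        # ops under autocast run in fp16; the fp32 weights are never modified
        with torch.autocast(device_type="cuda", dtype=torch.float16):
            loss = F.cross_entropy(model(inputs), targets)
        total_loss += loss.float().item()  # accumulate the loss in float32
    model.train()
    return total_loss / len(eval_loader)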