Mixed-precision training (fp16): how do I use the model in production?

I’ve fine-tuned a RoBERTa model and a DeBERTa model, both in fp16. The DeBERTa model was also pre-trained in fp16.

Now I want to use these models in production.

Is it possible to convert the fp16 model to ONNX at 16-bit precision and use it in production?

When I try, I have to use 32-bit floating point (fp32), even though it makes no difference. Should I be looking into bf16?
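
For context on the bf16 part of the question, here is a minimal sketch of how the two 16-bit formats differ, using only the Python standard library. The helper names `to_fp16` and `to_bf16` are made up for illustration, and the bf16 conversion simply truncates the low 16 bits of the fp32 encoding, whereas real hardware usually rounds to nearest, so treat it as an approximation rather than how any framework does it.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to IEEE 754 half precision (fp16) and convert back to float."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses values outside the fp16 range (max finite value 65504);
        # an fp16 computation would overflow to infinity here instead.
        return float("inf") if x > 0 else float("-inf")


def to_bf16(x: float) -> float:
    """Approximate bfloat16 by keeping only the top 16 bits of the fp32 encoding.

    Assumption: truncation instead of round-to-nearest, for simplicity.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


# Range: fp16 overflows just above 65504, bf16 keeps the fp32 exponent range.
print(to_fp16(70000.0))  # inf
print(to_bf16(70000.0))  # 69632.0

# Precision: fp16 has 10 mantissa bits, bf16 only 7.
print(to_fp16(1.001))    # 1.0009765625
print(to_bf16(1.001))    # 1.0
```

So bf16 trades precision for range: it is less likely to overflow than fp16, but it represents values more coarsely. Whether that helps with the ONNX export is a separate question, since it depends on which operators the target runtime supports in each type.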