I’ve fine-tuned a RoBERTa model and a DeBERTa model, both in fp16. The DeBERTa model was also pre-trained in fp16.
Now I want to use the models in production.
Is it possible to convert an fp16 model to ONNX at fp16 precision and use it in production?
When I try, I have to use fp32, even though it makes no difference. Should I be looking into bf16?