If I use mixed precision training with fp16, do I still need torchdynamo (TensorRT)?

I am trying to learn about GPU usage from the Hugging Face docs. There is mixed precision training with fp16, and there is also an "Inference with torchdynamo" section, which can use TensorRT.

TensorRT provides INT8 (via quantization-aware training or post-training quantization) as well as FP16 optimizations.
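As I understand it, that FP16 path is applied at compile/inference time, roughly like this (my own sketch, not from the HF docs; it assumes a CUDA GPU, the torch-tensorrt package, and uses a torchvision model purely as a placeholder):

```python
import torch
import torch_tensorrt  # assumes the torch-tensorrt package is installed
import torchvision.models as models

# A small placeholder model; any TorchScript-traceable module works the same way.
model = models.resnet18(weights=None).eval().cuda()

# Compile with TensorRT and allow FP16 kernels. This happens after training,
# and is separate from fp16 mixed precision *training*.
trt_model = torch_tensorrt.compile(
    model,
    inputs=[torch_tensorrt.Input((1, 3, 224, 224))],
    enabled_precisions={torch.half},
)

with torch.no_grad():
    out = trt_model(torch.randn(1, 3, 224, 224, device="cuda"))
```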

Why would I need torchdynamo if I already use mixed precision training with fp16 in the training arguments? Does it add anything special?
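For reference, this is what I mean by "fp16 in the training arguments" (a minimal sketch; the output directory and batch size are just placeholders):

```python
from transformers import TrainingArguments, Trainer

# Mixed precision training: forward/backward passes run in fp16 on a CUDA GPU,
# while a master copy of the weights is kept in fp32.
training_args = TrainingArguments(
    output_dir="out",
    fp16=True,                       # enable mixed precision training
    per_device_train_batch_size=8,
    num_train_epochs=1,
)

# trainer = Trainer(model=model, args=training_args, train_dataset=train_dataset)
# trainer.train()
```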

It’s still rough, but if you are looking for support for PTQ + static quantization + TensorRT, Accelerated inference on NVIDIA GPUs can be a good reference.
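As a rough sketch of what that looks like with Optimum's ONNX Runtime integration (the model name is just an example, and the exact keyword arguments such as `export=True` vs. `from_transformers=True` depend on your Optimum version, so check the guide):

```python
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer, pipeline

model_id = "distilbert-base-uncased-finetuned-sst-2-english"

# Export the model to ONNX and run it through ONNX Runtime's TensorRT execution provider.
ort_model = ORTModelForSequenceClassification.from_pretrained(
    model_id,
    export=True,
    provider="TensorrtExecutionProvider",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

clf = pipeline("text-classification", model=ort_model, tokenizer=tokenizer, device=0)
print(clf("I love this movie!"))
```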

Thank you for the information.

I was referring to this link: Efficient Training on a Single GPU, the "Inference with torchdynamo" section.

It also covers TensorRT, which can use FP16, and the Trainer argument also supports FP16. I need to know whether these are two different things or whether the same thing happens in both cases. If they are different ways of using FP16, can both be enabled together for training? Something like the sketch below is what I had in mind.
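This is only a sketch of what I was trying to combine; it assumes a transformers version where TrainingArguments still exposes the `torchdynamo` option, and the backend name is version-dependent:

```python
from transformers import TrainingArguments

# fp16 mixed precision for the training loop, plus a torchdynamo/TensorRT
# backend for the faster inference path. The "fx2trt" backend name may not
# exist in newer releases (see the reply below).
training_args = TrainingArguments(
    output_dir="out",
    fp16=True,             # mixed precision training
    torchdynamo="fx2trt",  # TensorRT-backed torchdynamo backend (version-dependent)
)
```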

Right - refer to: Repurpose torchdynamo training args towards torch._dynamo by sgugger · Pull Request #20498 · huggingface/transformers · GitHub

The “fx2trt-fp16” backend is not advertised by PyTorch, so I removed it.

It can be pretty experimental.