I am encountering the following warning when training a model with the Trainer
provided by Hugging Face:
FutureWarning: Non-finite norm encountered in torch.nn.utils.clip_grad_norm_; continuing anyway. Note that the default behavior will change in a future release to error out if a non-finite total norm is encountered. At that point, setting error_if_nonfinite=false will be required to retain the old behavior.
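For context, here is a minimal snippet (not my actual training code, just an illustration) that triggers the same situation: when a gradient contains `inf`/`nan`, the total norm computed by `torch.nn.utils.clip_grad_norm_` is non-finite, which is what the warning is about.

```python
import math
import torch

# A single parameter whose gradient contains an inf entry,
# standing in for a gradient that has overflowed during training.
p = torch.nn.Parameter(torch.ones(3))
p.grad = torch.tensor([1.0, float("inf"), 2.0])

# The total gradient norm is inf, so clip_grad_norm_ emits the
# "Non-finite norm encountered" warning (error_if_nonfinite
# defaults to False, so it continues anyway).
total_norm = torch.nn.utils.clip_grad_norm_([p], max_norm=1.0)
print(float(total_norm))
assert math.isinf(float(total_norm))
```

Because the norm is infinite, the clipping scale factor collapses to ~0, so the update that step is effectively garbage, which may be related to the loss collapsing.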
The loss becomes 0 after about 500 steps. Is there a parameter I can pass to Trainer
to solve this issue?
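For reference, my understanding is that Trainer's gradient clipping is controlled through `TrainingArguments` (the `max_grad_norm` field). A sketch of the knobs I suspect are relevant; the specific values and the `output_dir` are placeholders, not a known fix:

```python
from transformers import TrainingArguments

# Placeholder configuration illustrating where the relevant
# settings live; the values are guesses, not a verified fix.
args = TrainingArguments(
    output_dir="out",       # placeholder path
    max_grad_norm=1.0,      # gradient clipping threshold used by Trainer
    learning_rate=1e-5,     # lowering the LR is a common mitigation
    fp16=False,             # mixed precision can overflow to inf/nan grads
)
```

If fp16 overflow is the cause, disabling it (or lowering the learning rate) is the kind of change I would expect to matter, but I have not confirmed that for this case.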