What to do about the non-finite norm warning in `clip_grad_norm_`?

I started to see this warning while training a language model:

```
FutureWarning: Non-finite norm encountered in torch.nn.utils.clip_grad_norm_; continuing anyway. Note that the default behavior will change in a future release to error out if a non-finite total norm is encountered. At that point, setting error_if_nonfinite=false will be required to retain the old behavior.
```

Is this an indicator that my model is not training well? And if so, are there any recommendations on what to change? Thanks!
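For context, here is a minimal sketch of the kind of check that would show whether the gradients are already non-finite before clipping. The model, inputs, and `max_norm` value below are just stand-ins, not my actual setup:

```python
import torch
import torch.nn as nn

# Stand-in model and batch so the snippet runs on its own;
# substitute your own language model, inputs, and loss.
model = nn.Linear(16, 16)
loss = model(torch.randn(8, 16)).pow(2).mean()
loss.backward()

# Flag any parameter whose gradient is already NaN/Inf before clipping;
# a non-finite total norm in clip_grad_norm_ means at least one of these fired.
for name, param in model.named_parameters():
    if param.grad is not None and not torch.isfinite(param.grad).all():
        print(f"non-finite gradient in {name}")

# This is the call that emits the FutureWarning when the total norm is non-finite.
total_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
print("total grad norm:", float(total_norm))
```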

Any insights on this?

@sgugger Can you please provide some insight? I get this warning even before training starts with the Trainer.

I have never encountered that warning, so I will look into it. It looks like a behavior change is coming in a future PyTorch release.
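In the meantime, the warning itself says how to handle the deprecation side of it: pass `error_if_nonfinite` explicitly when you call `clip_grad_norm_` yourself. A rough sketch below (placeholder model, example `max_norm`, and assuming a PyTorch version recent enough to have the keyword); how or whether this can be forwarded through the Trainer is a separate question:

```python
import torch
import torch.nn as nn

model = nn.Linear(16, 16)                       # placeholder for the real model
model(torch.randn(4, 16)).sum().backward()

# Keep the old behavior explicitly: continue anyway on a non-finite total norm.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0,
                               error_if_nonfinite=False)

# Or opt in to the future default now and fail fast when gradients blow up.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0,
                               error_if_nonfinite=True)
```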