Why is the output of T5 multiplied by a scalar before the LM head?

I’m wondering why the outputs of T5 are multiplied by a scalar before being fed into the LM head.
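
To make the question concrete, here is a minimal sketch of the step I mean. As far as I can tell, the Hugging Face T5 code rescales the decoder output by `d_model ** -0.5` right before projecting it onto the vocabulary. The snippet below is a plain-Python stand-in, not the library code: the helper names `scale_hidden` and `lm_head` and the toy values are made up for illustration.

```python
# Illustrative sketch only: a plain-Python stand-in for the scaling step,
# not the actual transformers implementation. All names and values are hypothetical.

def scale_hidden(hidden, d_model):
    """Multiply every element of a hidden-state vector by d_model ** -0.5."""
    factor = d_model ** -0.5
    return [h * factor for h in hidden]

def lm_head(hidden, weight):
    """Project a hidden vector onto the vocabulary: one dot product per weight row."""
    return [sum(h * w for h, w in zip(hidden, row)) for row in weight]

d_model = 4
hidden = [2.0, -4.0, 6.0, 8.0]      # toy decoder output for a single token
weight = [                          # toy LM head weights, shape (vocab=3, d_model=4)
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
]

scaled = scale_hidden(hidden, d_model)   # factor is 4 ** -0.5 = 0.5
logits = lm_head(scaled, weight)

print(scaled)   # [1.0, -2.0, 3.0, 4.0]
print(logits)   # [1.0, 1.0, 6.0]
```

With this toy input the logits come out at exactly half of what the unscaled projection would give, so the scalar only changes the magnitude of the logits, not their ordering.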

(Link to the original issue: https://github.com/huggingface/transformers/issues/5565)