`seq_classif_dropout = 0.2` — what is the use of adding dropout in the classification head?

While using the DistilBERT model from Hugging Face, I noticed there is a dropout layer in the classification head, applied before the final softmax. Why are we dropping out information right before the prediction? It seems like a bad idea to me, but since Hugging Face set `seq_classif_dropout` to 0.2 as the default, I'd like to know if there is a good reason behind this.
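For context, here is my rough mental model of what that dropout does (a minimal numpy sketch, not the actual Hugging Face code; the names and shapes are made up). Note that dropout only zeros units during training and is a no-op at inference, so no information is dropped when you actually predict:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(x, p=0.2, training=True):
    # Inverted dropout: zero a fraction p of units and rescale the rest
    # by 1/(1-p), so the expected activation is unchanged.
    # At eval time it is the identity.
    if not training or p == 0.0:
        return x
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# Hypothetical pooled features and classifier weights, for illustration only.
features = rng.normal(size=8)
W = rng.normal(size=(3, 8))  # 3-class classifier

train_probs = softmax(W @ dropout(features, p=0.2, training=True))
eval_probs = softmax(W @ dropout(features, p=0.2, training=False))

print(train_probs)  # noisy: some features were zeroed this pass
print(eval_probs)   # deterministic: dropout disabled at inference
```

Is the idea just the usual regularization story, i.e. the head cannot rely on any single feature because it may be zeroed during training?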