Please help me understand the purpose of the Dropout layer that appears as the last layer of the TFDistilBertForSequenceClassification model.
Model training on [toxic]
Model: "tf_distil_bert_for_sequence_classification_2"

Layer (type)                  Output Shape   Param #
distilbert (TFDistilBertMain  multiple       66362880
pre_classifier (Dense)        multiple       590592
classifier (Dense)            multiple       1538
dropout_59 (Dropout)          multiple       0          <----

Total params: 66,955,010
Trainable params: 592,130
Non-trainable params: 66,362,880
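As a sanity check on the summary above, here is a small sketch showing that the listed parameter counts add up. It assumes a hidden size of 768 (the DistilBERT base default) and 2 labels; neither value is stated in the summary, and `dense_params` is just a helper name made up for this check. It only confirms that the Dropout layer holds no weights and that the two Dense layers account for all trainable parameters.

```python
# Assumptions (not stated in the summary): hidden size 768, 2 output labels.
hidden_size = 768
num_labels = 2


def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected (Dense) layer."""
    return n_in * n_out + n_out


pre_classifier = dense_params(hidden_size, hidden_size)  # 768 -> 768
classifier = dense_params(hidden_size, num_labels)       # 768 -> 2
dropout = 0                                               # Dropout has no weights

trainable = pre_classifier + classifier + dropout
total = 66362880 + trainable  # frozen distilbert body + classification head

print(pre_classifier, classifier, trainable, total)
```

Under those assumptions the numbers match the summary: 590592 and 1538 for the two Dense layers, 592,130 trainable parameters, and 66,955,010 in total.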
I expected the last layer to be classifier (Dense), but the summary lists Dropout last. The model's output is logits of shape (batch_size, num_labels), so I am not sure why a Dropout layer is there.
I would appreciate any help.