Changing dropout during DistilBERT fine-tuning

Hello everyone,

I’m trying to fine-tune DistilBERT, but I would like to do it with a different dropout value. I can see this should be possible, since dropout is part of the DistilBERT config, as shown here:

from transformers import TFDistilBertModel, DistilBertConfig
DISTILBERT_DROPOUT = 0.2
DISTILBERT_ATT_DROPOUT = 0.2
 
# Configure DistilBERT's initialization
config = DistilBertConfig(dropout=DISTILBERT_DROPOUT, 
                          attention_dropout=DISTILBERT_ATT_DROPOUT, 
                          output_hidden_states=True)

My problem is that it’s not clear how this config can be passed to the Trainer:

trainer = Trainer(
    model_init=model_init,
    args=training_args,
    train_dataset=sample_train_dataset,
    eval_dataset=encoded_dataset[validation_key],
    compute_metrics=compute_metrics,
    callbacks=[early_stopping_callback],
    tokenizer=tokenizer
)

My question is: how can I pass these DistilBERT-specific configuration values to the Trainer? Can I just include the config as another input to the Trainer? Should I merge it with the training args?

Thank you for your help.

Hello, could somebody help me? Thanks.

Hi,

The recommended way is to pass them as keyword arguments to the from_pretrained method. Do note that I’m loading a pre-trained DistilBertForSequenceClassification below, since you need a sequence classification head on top if you want to fine-tune for sequence classification.

from transformers import DistilBertForSequenceClassification
DISTILBERT_DROPOUT = 0.2
DISTILBERT_ATT_DROPOUT = 0.2
 
# Load the model with the dropout overrides applied to its config
model = DistilBertForSequenceClassification.from_pretrained(
    "distilbert/distilbert-base-uncased",
    dropout=DISTILBERT_DROPOUT,
    attention_dropout=DISTILBERT_ATT_DROPOUT,
)
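
# Optional sanity check: the keyword overrides should now be
# reflected in the loaded model's config
print(model.config.dropout, model.config.attention_dropout)  # expect: 0.2 0.2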

# Pass model to Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=sample_train_dataset,
    eval_dataset=encoded_dataset[validation_key],
    compute_metrics=compute_metrics,
    callbacks=[early_stopping_callback],
    tokenizer=tokenizer
)
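
And in case you want to keep the model_init approach from your first snippet (the Trainer calls model_init to build a fresh model, e.g. during hyperparameter search), here is a minimal sketch that applies the same overrides inside the function, reusing the checkpoint name and constants from above:

def model_init():
    # The Trainer calls this to (re-)create the model, so the dropout
    # overrides are applied on every re-initialization
    return DistilBertForSequenceClassification.from_pretrained(
        "distilbert/distilbert-base-uncased",
        dropout=DISTILBERT_DROPOUT,
        attention_dropout=DISTILBERT_ATT_DROPOUT,
    )

trainer = Trainer(
    model_init=model_init,
    args=training_args,
    train_dataset=sample_train_dataset,
    eval_dataset=encoded_dataset[validation_key],
    compute_metrics=compute_metrics,
    callbacks=[early_stopping_callback],
    tokenizer=tokenizer
)

Either way, you don’t merge the config into the TrainingArguments: the dropout values live on the model’s config, and the Trainer just receives the model (or a function that builds it).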