Changing dropout during DistilBERT fine-tuning

Hello everyone,

I’m trying to fine-tune DistilBERT, but I would like to do it with a different dropout value. I can see this should be possible, since dropout is part of the DistilBERT config, as shown here:

from transformers import TFDistilBertModel, DistilBertConfig
DISTILBERT_DROPOUT = 0.2
DISTILBERT_ATT_DROPOUT = 0.2
 
# Configure DistilBERT's initialization
config = DistilBertConfig(dropout=DISTILBERT_DROPOUT, 
                          attention_dropout=DISTILBERT_ATT_DROPOUT, 
                          output_hidden_states=True)

My problem is that it’s not clear how this config can be passed to the Trainer:

trainer = Trainer(
    model_init=model_init,
    args=training_args,
    train_dataset=sample_train_dataset,
    eval_dataset=encoded_dataset[validation_key],
    compute_metrics=compute_metrics,
    callbacks=[early_stopping_callback],
    tokenizer=tokenizer
)

My question is: how can I pass these DistilBERT-specific configuration values to the Trainer? Can I just include the config as another input to the Trainer? Should I merge it with the training args?

Thank you for your help.

Hello, could somebody help me? Thanks.

Hi,

The recommended way is to pass them as keyword arguments to the from_pretrained method. Do note that I’m loading a pre-trained DistilBertForSequenceClassification below, since you need a sequence classification head on top if you want to fine-tune for sequence classification.

from transformers import DistilBertForSequenceClassification
DISTILBERT_DROPOUT = 0.2
DISTILBERT_ATT_DROPOUT = 0.2
 
# Load the model with the dropout overrides applied to its config
model = DistilBertForSequenceClassification.from_pretrained(
    "distilbert/distilbert-base-uncased",
    dropout=DISTILBERT_DROPOUT,
    attention_dropout=DISTILBERT_ATT_DROPOUT,
)
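
# Optional sanity check: the keyword overrides should now be
# reflected in the loaded model's config
print(model.config.dropout, model.config.attention_dropout)  # expect: 0.2 0.2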

# Pass model to Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=sample_train_dataset,
    eval_dataset=encoded_dataset[validation_key],
    compute_metrics=compute_metrics,
    callbacks=[early_stopping_callback],
    tokenizer=tokenizer
)
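
And in case you want to keep the model_init approach from your first snippet (the Trainer calls model_init to build a fresh model, e.g. during hyperparameter search), here is a minimal sketch that applies the same overrides inside the function, reusing the checkpoint name and constants from above:

def model_init():
    # The Trainer calls this to (re-)create the model, so the dropout
    # overrides are applied on every re-initialization
    return DistilBertForSequenceClassification.from_pretrained(
        "distilbert/distilbert-base-uncased",
        dropout=DISTILBERT_DROPOUT,
        attention_dropout=DISTILBERT_ATT_DROPOUT,
    )

trainer = Trainer(
    model_init=model_init,
    args=training_args,
    train_dataset=sample_train_dataset,
    eval_dataset=encoded_dataset[validation_key],
    compute_metrics=compute_metrics,
    callbacks=[early_stopping_callback],
    tokenizer=tokenizer
)

Either way, you don’t merge the config into the TrainingArguments: the dropout values live on the model’s config, and the Trainer just receives the model (or a function that builds it).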