I am currently fine-tuning DistilBERT for sequence classification on a multi-label setup, specifically 3 labels for sentiment classification, on my own custom dataset, and I am getting quite high loss values of between 0.4 and 0.5. I have tried various things, such as learning rates from 3e-05 down to 1e-05, and dropout rates of 0.3 to 0.4 on the embeddings and 0.2 to 0.4 on the sequence classification layer. Are there other ways of reducing the loss?
- Maybe you are not training long enough. Is your validation loss much higher than your training loss?
- Maybe you do not have enough data
- Maybe your dataset is very imbalanced
- Maybe the problem is simply too hard and your labels are too similar
- That is quite a narrow range of learning rates. Try starting from 1e-03 and decreasing until your train/validation loss curves look promising
- Use a learning-rate scheduler
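On the imbalance point: a common remedy is to weight the loss by inverse class frequency. A minimal sketch, assuming your labels are available as a plain Python list of integers (the `train_labels` name and the toy values here are placeholders, not from your post):

```python
from collections import Counter

# Toy label list standing in for the real dataset's labels (placeholder).
train_labels = [0, 0, 0, 0, 0, 1, 1, 2]

num_labels = 3
counts = Counter(train_labels)
total = len(train_labels)

# Inverse-frequency weights: rarer classes get a larger weight so the
# loss is not dominated by the majority class.
weights = [total / (num_labels * counts[c]) for c in range(num_labels)]
print(weights)
```

In PyTorch you can then pass these as `torch.nn.CrossEntropyLoss(weight=torch.tensor(weights))` so that errors on minority classes count more toward the loss.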
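To make the scheduler suggestion concrete: a typical choice for transformer fine-tuning is linear warmup followed by linear decay, the same shape as `transformers.get_linear_schedule_with_warmup`. Here it is sketched as a plain multiplier function (the step counts are illustrative, not tuned for your dataset):

```python
def lr_lambda(step, warmup_steps, total_steps):
    """Multiplier applied to the base learning rate at a given step."""
    if step < warmup_steps:
        # Linear warmup: ramp from 0 up to 1 over warmup_steps.
        return step / max(1, warmup_steps)
    # Linear decay: ramp from 1 back down to 0 by total_steps.
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

# Example: 100 training steps with 10% warmup.
mults = [lr_lambda(s, 10, 100) for s in (0, 5, 10, 55, 100)]
print(mults)  # [0.0, 0.5, 1.0, 0.5, 0.0]
```

In PyTorch this plugs in via `torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lambda s: lr_lambda(s, 10, 100))`, calling `scheduler.step()` once per optimizer step.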