Unbalanced training with BERT

I have been looking further into this, and it seems like BERT (the Hugging Face implementation) learns in an imbalanced way for some reason.

I have a very small vocabulary since I’m training on genetic data, but you can basically think of it as if I were just using the letters of the alphabet as my vocabulary.
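
For context, the model is set up along these lines (a minimal sketch, not my exact code; the hyperparameters and the four-letter vocabulary are just placeholders):

```python
from transformers import BertConfig, BertForMaskedLM

# Tiny character-level vocabulary: special tokens plus the four bases
# (placeholder values -- the real vocab is built the same way, just small).
vocab = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]", "A", "C", "G", "T"]

config = BertConfig(
    vocab_size=len(vocab),
    hidden_size=256,
    num_hidden_layers=4,
    num_attention_heads=4,
    intermediate_size=512,
    max_position_embeddings=512,
)
model = BertForMaskedLM(config)
```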

I’m plotting a confusion matrix as I train, and what I see is that two of these letters get predicted more than 90% of the time, even though they are only slightly more common than the others in their natural occurrence.
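
The confusion matrix is computed over the masked positions only, roughly like this (again a simplified sketch; `logits` and `labels` come out of the masked-LM batch, where unmasked positions are labelled -100):

```python
import torch
from sklearn.metrics import confusion_matrix

def masked_lm_confusion_matrix(logits: torch.Tensor,
                               labels: torch.Tensor,
                               vocab_size: int):
    """Confusion matrix over the masked positions of one batch.

    logits: (batch, seq_len, vocab_size) output of BertForMaskedLM
    labels: (batch, seq_len) with -100 at positions that were not masked
    """
    preds = logits.argmax(dim=-1)   # predicted token id per position
    mask = labels != -100           # only score the masked positions
    y_true = labels[mask].cpu().numpy()
    y_pred = preds[mask].cpu().numpy()
    return confusion_matrix(y_true, y_pred, labels=list(range(vocab_size)))
```

Summing the columns of that matrix is what shows the skew: two token ids soak up more than 90% of the predictions.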

Have any of you seen this kind of imbalance before when training a BERT model? Or do you have any advice as to why this might be happening?