Low Accuracy and Stagnant Validation Accuracy in BERT Model for Multilabel Classification

Hello everyone,

I’m fine-tuning a BERT-based model for a multilabel classification task. Specifically, I’m trying to predict the Big Five personality traits based on text inputs. However, I’m encountering an issue where both my accuracy and validation accuracy remain quite low (below 30%) and stagnant during training. I’d greatly appreciate any insights or advice on how to address this issue.

Dataset:
I have a dataset containing 9,918 samples, where each sample is a text input labeled with binary values (0 or 1) for each of the Big Five personality traits. The traits are represented in the following columns (a sketch of how I turn them into train_dataset and val_dataset follows the list):

E: Extraversion
N: Neuroticism
A: Agreeableness
C: Conscientiousness
O: Openness
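
For context, train_dataset and val_dataset (used in model.fit further down) are built roughly like this. This is a simplified sketch rather than my exact pipeline; treat the DataFrame df, its 'text' column, the 80/20 split, and MAX_LENGTH = 128 as placeholders.

from sklearn.model_selection import train_test_split
from transformers import DistilBertTokenizerFast
import tensorflow as tf

MAX_LENGTH = 128   # placeholder; only needs to match the model's input length
NUM_CLASSES = 5    # one output per trait (E, N, A, C, O)

tokenizer = DistilBertTokenizerFast.from_pretrained('distilbert-base-uncased')

def make_dataset(texts, labels, batch_size=64, shuffle=False):
    # Tokenize to fixed-length input_ids / attention_mask tensors
    enc = tokenizer(list(texts), padding='max_length', truncation=True,
                    max_length=MAX_LENGTH, return_tensors='tf')
    features = {'input_ids': enc['input_ids'], 'attention_mask': enc['attention_mask']}
    ds = tf.data.Dataset.from_tensor_slices((features, labels))
    if shuffle:
        ds = ds.shuffle(len(labels), seed=42)
    return ds.batch(batch_size)

# df is the (hypothetical) DataFrame holding the raw text and the five label columns
texts = df['text'].values
labels = df[['E', 'N', 'A', 'C', 'O']].values.astype('float32')
X_train, X_val, y_train, y_val = train_test_split(texts, labels, test_size=0.2, random_state=42)

train_dataset = make_dataset(X_train, y_train, shuffle=True)
val_dataset = make_dataset(X_val, y_val)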

Below is the architecture of my model:

# Load the pre-trained DistilBERT encoder (num_labels only ends up in the config here;
# the classification head is defined manually in create_model below)
from transformers import TFDistilBertModel

bert_model = TFDistilBertModel.from_pretrained('distilbert-base-uncased', num_labels=NUM_CLASSES)

from tensorflow.keras.layers import Input, Dense, Layer, Dropout
from tensorflow.keras.models import Model, load_model
from tensorflow.keras.optimizers import Adam
import keras

@keras.saving.register_keras_serializable(package="BertLayer")
class BertLayer(Layer):
    def __init__(self, bert_model, **kwargs):
        super().__init__(**kwargs)
        self.bert_model = bert_model

    def call(self, inputs):
        input_ids, attention_mask = inputs
        # Return the last hidden state, shape (batch_size, seq_len, hidden_size)
        return self.bert_model(input_ids=input_ids, attention_mask=attention_mask)[0]

    def get_config(self):
        base_config = super().get_config()
        config = {
            "bert_model": keras.saving.serialize_keras_object(self.bert_model),
        }
        return {**base_config, **config}

    @classmethod
    def from_config(cls, config):
        bert_model_config = config.pop("bert_model")
        bert_model = keras.saving.deserialize_keras_object(bert_model_config)
        return cls(bert_model, **config)

def create_model(bert_model, MAX_LENGTH, NUM_CLASSES):
    input_ids = Input(shape=(MAX_LENGTH,), dtype='int32', name='input_ids')
    attention_mask = Input(shape=(MAX_LENGTH,), dtype='int32', name='attention_mask')
    bert_output = BertLayer(bert_model)([input_ids, attention_mask])
    cls_token = bert_output[:, 0, :]  # hidden state of the [CLS] token

    x = Dense(256, activation='relu')(cls_token)
    x = Dropout(0.1)(x)
    x = Dense(128, activation='relu')(x)
    x = Dropout(0.1)(x)

    output = Dense(NUM_CLASSES, activation='sigmoid', name='output')(x)

    return Model(inputs=[input_ids, attention_mask], outputs=output)

model = create_model(bert_model, MAX_LENGTH, NUM_CLASSES)
model.compile(optimizer=Adam(learning_rate=1e-5),
              loss='binary_crossentropy', 
              metrics=['acc'])

model.summary()

history = model.fit(
    train_dataset,
    epochs=5,
    validation_data=val_dataset,
    # no batch_size here: the tf.data datasets are already batched,
    # and Keras rejects a batch_size argument for Dataset inputs
)

Why might the accuracy and validation accuracy be stagnating at such a low level (~27%)? Is this expected for a dataset like mine, or could there be something wrong with the architecture/training process?
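
One thing I want to rule out is what the reported accuracy actually measures. As far as I understand, with loss='binary_crossentropy' Keras resolves the 'acc' string to per-label binary accuracy, whereas "all five traits correct at once" (exact match) is a much stricter number: five independent labels each at ~77% per-label accuracy would already give roughly 0.77^5 ≈ 27% exact match. Here is a rough sketch of how I'd compare the two on the validation set, assuming the val_dataset built above and a 0.5 decision threshold:

import numpy as np

# Gather true labels and predictions from the (batched, unshuffled) validation dataset
y_true = np.concatenate([y.numpy() for _, y in val_dataset], axis=0).astype(int)
y_prob = model.predict(val_dataset)
y_pred = (y_prob >= 0.5).astype(int)   # threshold each trait independently

per_label_acc = (y_pred == y_true).mean()                # fraction of correct individual labels
exact_match_acc = (y_pred == y_true).all(axis=1).mean()  # fraction of rows with all five correct

print(f"per-label accuracy:   {per_label_acc:.3f}")
print(f"exact-match accuracy: {exact_match_acc:.3f}")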

There are certainly plenty of tricks, such as carefully selecting and preprocessing the dataset before training, but even without them an accuracy below 30% seems too low; even examples that didn't work well scored higher than that.
My guess is that the problem is in the code or the parameters rather than the dataset, or that some part of the pipeline isn't doing what you expect. I've never trained BERT myself, though, so I can't point to a specific mistake…
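
That said, one check that doesn't require training anything is to look at how balanced the labels are and what a trivial "always predict the majority class" baseline would score. A rough sketch, assuming the labels sit in a pandas DataFrame df with the E/N/A/C/O columns from your post; if the model's per-label accuracy is below this baseline, something in the pipeline (labels, metric, or preprocessing) is probably off:

# Share of positive labels per trait, plus the accuracy a trivial
# majority-class baseline would already reach on each one
label_cols = ['E', 'N', 'A', 'C', 'O']
positive_rate = df[label_cols].mean()
majority_baseline = positive_rate.apply(lambda p: max(p, 1 - p))

print(positive_rate)
print("average per-label majority baseline:", round(float(majority_baseline.mean()), 3))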