Hi all, I have a dataset of about 100,000 email messages, each labelled with one of roughly 300 labels. I am training a model on it for automated email classification.
The dataset is pre-shuffled and split 80/20 into train/test. During training, my eval accuracy never gets very good, even after 10-15 epochs (each epoch takes about 1 hour to train).
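To be concrete about what I mean by the 80/20 split and by eval accuracy, here is a minimal sketch in plain Python. It is an illustration under assumptions, not my actual pipeline: `split_pre_shuffled`, `accuracy`, and the toy data are hypothetical names made up for this post.

```python
from collections import Counter


def split_pre_shuffled(examples, train_fraction=0.8):
    """Split an already-shuffled list into (train, test) without reshuffling."""
    cut = int(len(examples) * train_fraction)
    return examples[:cut], examples[cut:]


def accuracy(predicted_labels, true_labels):
    """Fraction of predictions that exactly match the reference labels."""
    if len(predicted_labels) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    if not true_labels:
        return 0.0
    correct = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return correct / len(true_labels)


# Toy stand-in for the real data: (text, label_id) pairs with 3 labels.
examples = [(f"email {i}", i % 3) for i in range(10)]
train, test = split_pre_shuffled(examples)

# Per-label counts in the training split; with ~300 labels this shows
# how many examples each label actually has to learn from.
label_counts = Counter(label for _, label in train)
```

Counting examples per label like this is just a sanity check on the split; it does not change how the model is trained.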
I am using AutoModelForSequenceClassification, initialized from the pre-trained distilbert-base-uncased checkpoint, with the training arguments below:
train_batch_size = 32
eval_batch_size = 8
num_train_epochs = 24
training_args = TrainingArguments(
Is there anything I can do to improve the accuracy? I have tried lower learning rates, different train/eval batch sizes, and different numbers of epochs, and it just doesn't seem to be getting anywhere.
Appreciate any help!