T5-small performance degradation with larger dataset: seeking advice

Hello everyone,

I’m new to machine learning and this is my first post here. I’m working on a small project that uses google/t5-small for address correction (separating irrelevant information from the address itself). I found a training configuration that worked well with my initial dataset of about 1600 records. The dataset has since grown to about 2400 records, but with the same configuration, the model’s address-correction performance after training on the larger dataset is worse than before.

Here’s the current training configuration for the model:

from datetime import datetime

from transformers import Trainer, TrainingArguments

epochs = 30
batch_size = 10
current_timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
run_name = f"b_{batch_size}_e_{epochs}_{current_timestamp}"

training_args = TrainingArguments(
    output_dir='./results',
    num_train_epochs=epochs,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    warmup_steps=1000,
    weight_decay=0.005,
    logging_dir=f'logs/{run_name}',
    logging_steps=50,
    eval_strategy="steps",          # evaluate every eval_steps steps
    eval_steps=50,
    report_to="tensorboard",
    run_name=run_name,
    load_best_model_at_end=True,    # restore the checkpoint with the lowest eval_loss
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
)
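
For context, here is a rough back-of-envelope calculation of how the training schedule shifts between the two dataset sizes with this configuration (the numbers are approximate, since I’m ignoring the validation split):

# Rough estimate of how the schedule changes as the dataset grows,
# using the batch_size, epochs, and warmup_steps from the config above.
batch_size = 10
epochs = 30
warmup_steps = 1000

for n_records in (1600, 2400):
    steps_per_epoch = n_records // batch_size
    total_steps = steps_per_epoch * epochs
    warmup_fraction = warmup_steps / total_steps
    print(f"{n_records} records: {steps_per_epoch} steps/epoch, "
          f"{total_steps} total steps, warmup covers {warmup_fraction:.1%}")

So with 1600 records the fixed 1000 warmup steps cover roughly 21% of training, while with 2400 records they cover roughly 14%, and the model also sees many more optimizer steps overall. I’m not sure how much this matters in practice.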

What changes could I try to improve the model’s performance with this larger dataset?

Thanks for any suggestions.