T5-small: fine-tuning parameters for a translation task

Hello dear forum people,

I have searched the forum but unfortunately did not find much that helps with my problem. I also don’t know exactly what information is needed to be able to help me, so I would rather write too much than too little.

We are using T5-small for a translation task, and our dataset has about 50,000 sentences. We want to “translate” formal German sentences into informal German sentences, i.e. perform text style transfer. We trained our model with the parameters from the Hugging Face course (Fine-Tune for Downstream Tasks: Translation).
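
For context, our preprocessing follows that course chapter. Roughly it looks like this (a simplified sketch; the dataset variable and the column names "formal"/"informal" are placeholders, not our real names):

from transformers import AutoTokenizer

tokenizerT5 = AutoTokenizer.from_pretrained("t5-small")

prefix = "formal to informal: "  # task prefix for T5 (placeholder wording)
max_length = 128

def preprocess_function(examples):
    # "formal" / "informal" are placeholder column names
    inputs = [prefix + s for s in examples["formal"]]
    targets = examples["informal"]
    # tokenize inputs and targets in one call (needs a recent transformers version)
    return tokenizerT5(
        inputs, text_target=targets, max_length=max_length, truncation=True
    )

tokenized_daten = raw_daten.map(preprocess_function, batched=True)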

Our model works, and at inference time we also tested whether it really learned something and can translate formal sentences into informal ones (with a positive result). The next step is to tune the training parameters to improve quality and performance. There are so many possible parameter combinations that some trial and error is of course unavoidable, but we are looking for proven combinations, or for ways to figure out in advance (before training) which parameter combinations make more sense than others.
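
Right now we only look at the evaluation loss. I assume that computing a translation metric such as SacreBLEU during evaluation, as in the course chapter, would make it easier to judge quality when comparing parameter combinations; roughly something like this (a sketch, we have not wired this in yet):

import numpy as np
import evaluate

metric = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    if isinstance(preds, tuple):
        preds = preds[0]
    decoded_preds = tokenizerT5.batch_decode(preds, skip_special_tokens=True)
    # labels use -100 as padding, which the tokenizer cannot decode
    labels = np.where(labels != -100, labels, tokenizerT5.pad_token_id)
    decoded_labels = tokenizerT5.batch_decode(labels, skip_special_tokens=True)
    result = metric.compute(
        predictions=[p.strip() for p in decoded_preds],
        references=[[l.strip()] for l in decoded_labels],
    )
    return {"bleu": result["score"]}

As far as I understand, this would also need predict_with_generate=True in the Seq2SeqTrainingArguments and compute_metrics=compute_metrics passed to the trainer below.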

These are our parameters:

from transformers import Seq2SeqTrainingArguments, Seq2SeqTrainer

training_args = Seq2SeqTrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    weight_decay=0.01,
    save_total_limit=3,
    num_train_epochs=100,
    fp16=False,
    report_to="wandb",           # enable logging to W&B
    run_name="persona T5-small"  # name of the W&B run (optional)
)

trainer = Seq2SeqTrainer(
    model=modelT5,
    args=training_args,
    train_dataset=tokenized_daten["train"],
    eval_dataset=tokenized_daten["test"],
    tokenizer=tokenizerT5,
    data_collator=data_collator,
)
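
One idea we had for narrowing down parameter combinations more systematically (instead of pure trial and error) is the Trainer's built-in hyperparameter_search with Optuna. We have not tried it yet, so this is only a sketch and the search ranges are made up:

from transformers import AutoModelForSeq2SeqLM

def model_init():
    # hyperparameter_search needs model_init instead of model,
    # so every trial starts from a fresh model
    return AutoModelForSeq2SeqLM.from_pretrained("t5-small")

def hp_space(trial):
    # made-up search ranges, just to illustrate the idea
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-5, 5e-4, log=True),
        "per_device_train_batch_size": trial.suggest_categorical(
            "per_device_train_batch_size", [8, 16, 32]
        ),
        "num_train_epochs": trial.suggest_int("num_train_epochs", 3, 20),
        "weight_decay": trial.suggest_float("weight_decay", 0.0, 0.1),
    }

search_trainer = Seq2SeqTrainer(
    model_init=model_init,
    args=training_args,
    train_dataset=tokenized_daten["train"],
    eval_dataset=tokenized_daten["test"],
    tokenizer=tokenizerT5,
    data_collator=data_collator,
)

best_run = search_trainer.hyperparameter_search(
    hp_space=hp_space,
    n_trials=10,
    direction="minimize",  # by default the objective is the evaluation loss
    backend="optuna",
)
print(best_run.hyperparameters)

If I understand it correctly, with a BLEU-style metric one would instead pass a compute_objective and direction="maximize".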

I hope this is enough information for you to give me some advice. Thanks to everyone who takes the time to look for an answer.