Evaluation step takes longer than training step

Hello,
I am trying to fine-tune the CodeT5+ 220M model on a custom dataset.
This is part of my code:

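import torch
from transformers import (
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model = AutoModelForSeq2SeqLM.from_pretrained(
    checkpoint,
    torch_dtype=torch.float32,
    trust_remote_code=True,
).to(device)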

training_args = Seq2SeqTrainingArguments(
    output_dir="local_model",
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    weight_decay=0.01,
    save_total_limit=3,
    num_train_epochs=20,
    predict_with_generate=True,
    fp16=True,
    remove_unused_columns=False,
    logging_dir="TensorBoard",
    do_train=True,
    do_eval=True,
    logging_strategy="steps",
    logging_steps=500,
    eval_steps=500,
    generation_max_length=106,
    generation_num_beams=2,
)

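# Pass the model object (not the checkpoint string) so the collator can
# prepare decoder_input_ids from the labels.
data_collator = DataCollatorForSeq2Seq(tokenizer=tokenizer, model=model)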

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=model_inputs["train"],
    eval_dataset=model_inputs["test"],
    tokenizer=tokenizer,
    data_collator=data_collator,
    compute_metrics=compute_metrics,
)

trainer.train()

Here’s my question:

Increasing generation_max_length makes each evaluation step take longer, but it does not affect the training step time.

A training step is usually much faster than an evaluation step, except when generation_max_length is less than 10.
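
One way to check this kind of observation outside the Trainer is to time each kind of step in isolation. The following is a minimal sketch of a generic timing helper; `time_step` and the two stand-in workloads are hypothetical names used only for illustration, not part of transformers. In a real run the callables would wrap one forward/backward pass and one `model.generate(...)` call, and on a GPU each callable should end with `torch.cuda.synchronize()` so that queued kernels are included in the measurement.

```python
import time
from statistics import median


def time_step(step_fn, repeats=5, warmup=1):
    """Return the median wall-clock seconds of calling step_fn()."""
    for _ in range(warmup):
        step_fn()  # warm-up calls are not timed
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        step_fn()
        samples.append(time.perf_counter() - start)
    return median(samples)


# Stand-in workloads (assumptions for illustration only).
def fake_train_step():
    sum(i * i for i in range(10_000))


def fake_eval_step():
    for _ in range(20):  # mimics many sequential decoding steps
        sum(i * i for i in range(10_000))


train_seconds = time_step(fake_train_step)
eval_seconds = time_step(fake_eval_step)
print(f"train: {train_seconds:.6f}s  eval: {eval_seconds:.6f}s")
```

Using the median of several repeats keeps a single slow call (for example, a first call that triggers lazy initialization) from skewing the comparison.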

As far as I know, a training step should take longer than an evaluation step, because it computes gradients and backpropagates.
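
That expectation holds when evaluation is a single forward pass. With predict_with_generate=True, however, Seq2SeqTrainer calls `model.generate()` during evaluation, which decodes autoregressively: one decoder pass per generated token. A rough sketch of how the number of sequential passes scales is shown below. The function names are hypothetical, and the counts are not a measurement of real time, since a single-token decoder pass is much cheaper than a full-sequence pass; they only show that the sequential work in evaluation grows with generation_max_length while the training step does not.

```python
def train_sequential_passes():
    """Sequential passes per training batch with teacher forcing.

    All target positions are processed in parallel, so the count does
    not depend on the target length: one forward and one backward pass.
    """
    return 2


def generate_sequential_passes(max_length):
    """Upper bound on sequential decoder passes for one generate() call.

    Autoregressive decoding emits one token per pass, so the bound grows
    with the length limit. Beam search widens each pass (more hypotheses
    per step) rather than adding passes, so num_beams is not counted here.
    """
    return max_length


for max_length in (5, 10, 50, 106):
    print(
        f"max_length={max_length:>3}: "
        f"train passes={train_sequential_passes()}, "
        f"generate passes<={generate_sequential_passes(max_length)}"
    )
```

Under these assumptions, the evaluation step would only be expected to approach the training step time when the generation length limit is very small.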

Would it be possible for someone to explain what’s going on to me?

Thanks