Evaluation step takes longer than training step

urizlo · October 23, 2023, 9:02am

Hello,
I am trying to fine-tune the CodeT5+ 220M model on a custom dataset.
This is part of my code:

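import torch
from transformers import (AutoModelForSeq2SeqLM, DataCollatorForSeq2Seq,
                          Seq2SeqTrainer, Seq2SeqTrainingArguments)

model = AutoModelForSeq2SeqLM.from_pretrained(
    checkpoint,
    torch_dtype=torch.float32,
    trust_remote_code=True,
).to(device)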

training_args = Seq2SeqTrainingArguments(
    output_dir="local_model",
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    weight_decay=0.01,
    save_total_limit=3,
    num_train_epochs=20,
    predict_with_generate=True,  # evaluation calls model.generate()
    fp16=True,
    remove_unused_columns=False,
    logging_dir="TensorBoard",
    do_train=True,
    do_eval=True,
    logging_strategy="steps",
    logging_steps=500,
    eval_steps=500,  # only used when evaluation_strategy="steps"
    generation_max_length=106,
    generation_num_beams=2,
)

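# Pass the model object (not the checkpoint name) so the collator can build decoder_input_ids from labels
data_collator = DataCollatorForSeq2Seq(tokenizer=tokenizer, model=model)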

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=model_inputs["train"],
    eval_dataset=model_inputs["test"],
    tokenizer=tokenizer,
    data_collator=data_collator,
    compute_metrics=compute_metrics,
)

trainer.train()

Here’s my question:

Increasing generation_max_length makes each evaluation step take longer, but it does not affect the training step time.

A training step is usually much faster than an evaluation step, except when generation_max_length is less than 10.

As far as I know, a training step should take longer than an evaluation step, because it computes gradients and backpropagates.

Would it be possible for someone to explain what’s going on to me?

Thanks
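
A note on the question above, offered as a rough sketch rather than a measured profile. With predict_with_generate=True, Seq2SeqTrainer calls model.generate() during evaluation, and generation is autoregressive: each new token depends on the previous one, so the decoder runs once per generated token. A training step uses teacher forcing, so the whole target sequence goes through the decoder in a single forward pass (plus one backward pass). The toy cost model below only counts sequential decoder calls. The function names are hypothetical and are not part of transformers; the default values mirror the arguments in the post.

```python
# Toy cost model (illustrative only, not a profiler): it counts decoder
# calls that must run one after another, not wall-clock time.

def sequential_decoder_calls(mode: str, generation_max_length: int = 106) -> int:
    """Return an upper bound on decoder forward calls that run in sequence."""
    if mode == "train":
        # Teacher forcing: the full target sequence is fed at once, so a
        # single forward pass covers every target position in parallel.
        return 1
    if mode == "generate":
        # Autoregressive decoding: one call per new token, so at most
        # generation_max_length calls (fewer if every sequence hits EOS early).
        return generation_max_length
    raise ValueError(f"unknown mode: {mode!r}")


def rows_per_decoder_call(mode: str, batch_size: int = 16, num_beams: int = 2) -> int:
    """Return how many sequences each decoder call processes."""
    if mode == "train":
        return batch_size
    if mode == "generate":
        # Beam search keeps num_beams candidate sequences per example.
        return batch_size * num_beams
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    train_calls = sequential_decoder_calls("train")
    for max_len in (8, 106, 256):
        gen_calls = sequential_decoder_calls("generate", max_len)
        print(
            f"generation_max_length={max_len}: up to {gen_calls} sequential "
            f"decoder calls per eval batch vs {train_calls} per training batch"
        )
```

Under this model, the cost of an evaluation step grows with generation_max_length while the training step stays constant, which matches what is described above. With a very small generation_max_length the handful of decoding calls can plausibly cost less than one forward and backward pass, which would explain the exception below 10. Setting predict_with_generate=False for one run should show how much of the evaluation time comes from generation, at the cost of losing generated-text metrics.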
