Trainer.evaluate() doesn't return evaluation loss

I am using Seq2SeqTrainer to LoRA-finetune the MADLAD-400 3B model on my own dataset.

When I run trainer.evaluate(), the evaluation loss is missing from the returned results.

Code:


output_dir = "madlad_run_5"
training_args = Seq2SeqTrainingArguments(
    output_dir,
    per_device_train_batch_size=32,
    # per_device_eval_batch_size=2,
    num_train_epochs=1,
    # max_steps=1000,
    # gradient_accumulation_steps=8,
    bf16=True,
    torch_compile=True,
    # gradient_checkpointing=True,
    # torch_empty_cache_steps=10,
    learning_rate=3e-4,
    lr_scheduler_type="cosine",
    weight_decay=0.00001,
    warmup_ratio=0.1,
    report_to="tensorboard",
    logging_dir=f"{output_dir}/tensorboard_logs",
    logging_first_step=True,
    logging_steps=1,
    save_strategy="epoch",
    # save_steps=1000,
    save_total_limit=1,
    eval_strategy="steps",
    eval_steps=5,
    include_for_metrics=["loss"],
    # batch_eval_metrics=True,
    optim="paged_adamw_32bit",
    dataloader_pin_memory=True,  # speeds up host-to-GPU transfers
    dataloader_num_workers=4,
    # eval_on_start=True,
)

trainer = Seq2SeqTrainer(
    peft_model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset, 
    data_collator=DataCollatorForSeq2Seq(tokenizer=tokenizer, model=model),
    # compute_metrics=compute_metrics,
    # callbacks=[eval_loss_callback]  # Add custom callback
)

eval_results = trainer.evaluate()
print(eval_results)

# trainer.train()

# peft_model.save_pretrained("madlad_run_2_lora_ckpt")

Output:

{'eval_runtime': 10.9764, 'eval_samples_per_second': 9.11, 'eval_steps_per_second': 0.456}

I want to get the evaluation loss as well.

Some extra context:
I am using TensorBoard for logging, and in the initial training runs the evaluation loss never appeared in the logs either. I tried several workarounds, such as adding callbacks and a compute_metrics function (a sketch of what I tried is below), but the evaluation loss still never showed up.
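For reference, this is roughly the compute_metrics workaround I attempted (a minimal sketch; it assumes include_for_metrics=["loss"] from the arguments above, which should make the per-batch losses available on EvalPrediction.losses in transformers >= 4.46):

import numpy as np

def compute_metrics(eval_pred):
    # With include_for_metrics=["loss"], the per-batch losses should
    # arrive on eval_pred.losses during evaluation.
    if eval_pred.losses is None:
        return {}
    # The Trainer prefixes returned keys with "eval_", so this would be
    # logged as "eval_mean_loss".
    return {"mean_loss": float(np.mean(eval_pred.losses))}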

Then I tried running trainer.evaluate() directly, but the evaluation loss was missing there as well.
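In case it helps to rule things out, here is a quick sanity check I would expect to matter (my understanding is that the Trainer can only compute an eval loss when the evaluation batches actually contain a labels key):

# Sanity check: the tokenized eval dataset should carry a "labels"
# column for any loss to be computed during evaluation.
print(test_dataset.column_names)        # should include "labels"
print(test_dataset[0]["labels"][:10])   # first few target token ids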

How can I get the evaluation loss both when the model is evaluated during training (i.e. in the evaluation logs) and when calling trainer.evaluate()?

System configuration:

- `transformers` version: 4.48.3
- Platform: Linux-5.15.0-127-generic-x86_64-with-glibc2.35
- Python version: 3.10.16
- Huggingface_hub version: 0.28.1
- Safetensors version: 0.5.2
- Accelerate version: 1.3.0
- Accelerate config:
        - compute_environment: LOCAL_MACHINE
        - distributed_type: MULTI_GPU
        - mixed_precision: no
        - use_cpu: False
        - debug: False
        - num_processes: 3
        - machine_rank: 0
        - num_machines: 1
        - gpu_ids: all
        - rdzv_backend: static
        - same_network: True
        - main_training_function: main
        - enable_cpu_affinity: False
        - downcast_bf16: no
        - tpu_use_cluster: False
        - tpu_use_sudo: False
        - tpu_env: []
- PyTorch version (GPU?): 2.6.0+cu124 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using distributed or parallel set-up in script?: no
- Using GPU in script?: yes
- GPU type: NVIDIA A40

Let me know if any other information is required; any help resolving this would be appreciated.

It looks like it will be difficult to get a loss…

I see. But I wonder why something as basic as “evaluation loss” is so difficult to get with Trainer. Why is the design like that? Is it not important?

I think eval loss is extremely important to track to understand model overfitting, right? We could use it to perform early stopping too (see the sketch below), but none of this is possible with Trainer.
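What I have in mind is something like this (a sketch with made-up values, using EarlyStoppingCallback, which as far as I can tell only works once eval_loss is actually reported):

from transformers import EarlyStoppingCallback, Seq2SeqTrainingArguments

# Track the best checkpoint by eval_loss and stop once it plateaus.
training_args = Seq2SeqTrainingArguments(
    output_dir="madlad_run_5",
    eval_strategy="steps",
    eval_steps=5,
    save_strategy="steps",   # must match eval_strategy so the best
    save_steps=5,            # checkpoint can be tracked
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

# Stop if eval_loss fails to improve for 3 consecutive evaluations.
early_stopping = EarlyStoppingCallback(early_stopping_patience=3)
# trainer = Seq2SeqTrainer(..., callbacks=[early_stopping])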

I'm curious why the design is like that. Any thoughts?
