Evaluation loss depends on batch size

I train a token classification model on a private dataset.
I noticed that the evaluation loss has different values if I change the value of per_device_eval_batch_size.
Is this a known issue?
The total test set size is not divisible by the batch size (per_device_eval_batch_size * gpu_num).
However, even when I use a batch size that does divide the test set size evenly, I still get a different value from the loss I compute directly with torch.nn.CrossEntropyLoss.
Also, accuracy, recall, etc. do not change with the batch size.
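
Here is a small, self-contained sketch of what I suspect is happening (made-up tensors rather than my real model, and a simplified stand-in for the actual evaluation loop, not the Trainer's code): averaging per-batch mean losses is not the same as taking one mean over all labelled tokens when the number of non-ignored tokens differs between batches.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
num_labels = 5
logits = torch.randn(10, 7, num_labels)          # 10 examples, 7 tokens each (made up)
labels = torch.randint(0, num_labels, (10, 7))
lengths = torch.randint(3, 8, (10,))             # varying number of labelled tokens per example
for i, n in enumerate(lengths):
    labels[i, int(n):] = -100                    # mask padding/special tokens

def eval_loss(batch_size):
    # Average of per-batch mean losses, roughly what a per-batch eval loop reports.
    batch_losses = []
    for start in range(0, logits.size(0), batch_size):
        lg = logits[start:start + batch_size].reshape(-1, num_labels)
        lb = labels[start:start + batch_size].reshape(-1)
        batch_losses.append(F.cross_entropy(lg, lb, ignore_index=-100))
    return torch.stack(batch_losses).mean().item()

# One global mean over every labelled token in the whole set.
global_loss = F.cross_entropy(
    logits.reshape(-1, num_labels), labels.reshape(-1), ignore_index=-100
).item()

print(eval_loss(2), eval_loss(5), global_loss)   # typically three slightly different numbers
```

Both batch sizes divide the 10 examples evenly, yet the averaged losses still differ from each other and from the single global mean, because each batch contributes its mean with equal weight regardless of how many labelled tokens it contains.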
Thanks for your help!

Apparently a known but neglected issue for several years…