Custom trainer evaluation function

Hi, I am trying to override the Trainer's evaluate() function with my own method. It runs, but when I use multiple GPUs (8 in my case), the eval dataset seems to get split across the 8 GPUs, and the reported metric is computed only on the eval subset from a single GPU.
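For reference, here is a minimal single-process sketch of what I suspect is happening (the shard/accuracy helpers are hypothetical illustrations, not the actual Trainer code): each "GPU" scores only its own shard, so reporting one rank's metric differs from summing the raw correct/total counts across all shards before dividing.

```python
# Simulate 8 processes each evaluating one shard of the eval set.

def shard(data, num_shards, rank):
    # Round-robin split, similar in spirit to a DistributedSampler.
    return data[rank::num_shards]

def counts(preds, labels):
    # Return raw (correct, total) counts rather than a ratio,
    # so counts from different shards can be summed safely.
    correct = sum(p == l for p, l in zip(preds, labels))
    return correct, len(labels)

preds  = [1, 0, 1, 1, 0, 1, 0, 0]
labels = [1, 1, 1, 0, 0, 1, 1, 0]
num_gpus = 8

per_shard = [counts(shard(preds, num_gpus, r), shard(labels, num_gpus, r))
             for r in range(num_gpus)]

# What I see now: only rank 0's metric, based on 1/8 of the data.
rank0_correct, rank0_total = per_shard[0]
print("rank0 metric:", rank0_correct / rank0_total)

# What I want: aggregate counts across shards, then divide.
total_correct = sum(c for c, _ in per_shard)
total_n = sum(n for _, n in per_shard)
print("global metric:", total_correct / total_n)  # 5/8 = 0.625
```

In the real multi-GPU case the per-shard counts would need to be gathered across processes (e.g. with torch.distributed) before computing the final metric.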

My boilerplate code is here: scratch/trainer_eval.py at master · kevinghst/scratch · GitHub

Could someone please take a look and help identify the underlying reason for this behavior?

Thanks,
Kevin