Evaluation metrics do not match with shuffle=False

Hi,

I am training a binary image classification model and evaluating it at the end of each epoch. If I create the DataLoader for the validation dataset with shuffle=True, the macro AUROC score is above 0.5 and shows an increasing trend over the later epochs. But if I create the DataLoader with shuffle=False, the macro AUROC score is very low (around 0.07) and does not cross 0.10 even by the end of the 100th epoch.

I also ran the same code without accelerate (on a single GPU), and there the macro AUROC score looked normal, i.e. 0.62, while evaluating the same checkpoint with accelerate gave <0.1.
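One thing I wonder about is whether the outputs need to be gathered across processes before computing the metric, so that each rank sees the full validation set rather than its own shard. A minimal sketch of what I mean, assuming `model`, `val_dataloader`, and `accelerator` come from `accelerator.prepare` (the helper name and variables are mine, just for illustration):

```python
import torch
from accelerate import Accelerator

def gather_eval_outputs(model, val_dataloader, accelerator: Accelerator):
    """Run the model over the validation set and collect the
    predictions/labels from every process onto each rank."""
    model.eval()
    all_probs, all_labels = [], []
    with torch.no_grad():
        for images, labels in val_dataloader:
            # assumes the model outputs one logit per sample, shape (B, 1)
            probs = torch.sigmoid(model(images).squeeze(-1))
            # gather_for_metrics also drops the duplicate samples that the
            # distributed sampler pads in to make the shards even.
            probs, labels = accelerator.gather_for_metrics((probs, labels))
            all_probs.append(probs.cpu())
            all_labels.append(labels.cpu())
    return torch.cat(all_probs), torch.cat(all_labels)
```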

I am using torchmetrics (torchmetrics.functional.classification.binary_auroc) for evaluation.
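For reference, the functional call I use looks like this (a runnable toy example; the numbers are made up):

```python
import torch
from torchmetrics.functional.classification import binary_auroc

preds = torch.tensor([0.1, 0.8, 0.4, 0.9])   # probabilities (logits also work)
target = torch.tensor([0, 1, 0, 1])          # integer class labels
print(binary_auroc(preds, target))           # tensor(1.) for this toy data
```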

Has anyone encountered a similar issue? I would appreciate any insights.

EDIT:
My batch_size is 256 and the total dataset size is 300. I tried to print the AUROC score for each batch:

| batch AUROC | batch_size |
|---|---|
| 0.0 | 256 |
| 0.6252874135971069 | 44 |

Somehow, the AUROC score for the first batch is returned as 0.0, and because of that my final AUROC score is erroneous. (My guess is that, with shuffle=False, the first 256 samples all belong to a single class, so AUROC is undefined on that batch.)
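If that is the cause, accumulating predictions across batches and computing AUROC once over the full validation set should avoid the undefined per-batch case. A sketch using the stateful metric from torchmetrics (the `batches` list is a toy stand-in for my per-batch model outputs):

```python
import torch
from torchmetrics.classification import BinaryAUROC

metric = BinaryAUROC()
# Toy stand-ins for per-batch predictions and labels.
batches = [
    (torch.tensor([0.2, 0.7]), torch.tensor([0, 1])),
    (torch.tensor([0.6, 0.1]), torch.tensor([1, 0])),
]
for probs, labels in batches:
    metric.update(probs, labels)   # accumulates state instead of averaging scores
print(metric.compute())            # AUROC computed over all samples at once
```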
