`target_sizes` and `output.logits` do not align in `image_processor.post_process_object_detection`

jb-bc · September 3, 2024, 9:43pm

I am trying to fine-tune RT-DETR on two GPUs following this script. The per-device batch size is 8, so with 2 GPUs each evaluation step covers 16 images.

When evaluation reaches the compute_metric method, I get a mismatch between `output.logits` and `target_sizes`: the batch dimension of `output.logits` is 8, while that of `target_sizes` is 16. This is the stack trace:

 File "/home/jb/.cache/pypoetry/virtualenvs/ml-Mf12zaqr-py3.11/lib/python3.11/site-packages/transformers/models/rt_detr/image_processing_rt_detr.py", line 1062, in post_process_object_detection
    raise ValueError(
ValueError: Make sure that you pass in as many target sizes as the batch dimension of the logits
 50%|██████████████████████████████████████████████████████████████████████████████████▌                                                                                  | 77/154 [00:39<00:39,  1.95it/s]

I suspect that the `target_sizes` tensor is being gathered from all devices when maybe it shouldn't be, since 16 is exactly 2 × 8. I would appreciate any help!
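
To narrow this down, a small diagnostic along these lines could be called just before `post_process_object_detection`. It is only a sketch: `describe_batch_mismatch` is a hypothetical helper, not part of transformers, and the shapes in the usage lines are illustrative (only the batch dimensions 8 and 16 come from my run). It compares the two batch dimensions and reports whether they differ by exactly the number of devices, which is what I would expect if one tensor is gathered across GPUs and the other is not.

```python
def describe_batch_mismatch(logits_shape, target_sizes_shape, num_devices):
    """Classify how the batch dimensions of logits and target_sizes relate.

    Hypothetical helper for debugging, not a transformers API.
    Returns one of: "aligned", "device_factor", "other".
    """
    logits_batch = logits_shape[0]
    targets_batch = target_sizes_shape[0]

    # Same batch dimension: post-processing would not raise this error.
    if logits_batch == targets_batch:
        return "aligned"

    larger = max(logits_batch, targets_batch)
    smaller = min(logits_batch, targets_batch)

    # One side is exactly num_devices times the other: consistent with
    # one tensor being gathered across devices and the other being per-device.
    if smaller > 0 and larger == smaller * num_devices:
        return "device_factor"

    # Any other ratio points to a different cause (e.g. a ragged last batch).
    return "other"


# Illustrative shapes: (batch, queries, classes) and (batch, 2).
# Only the batch sizes 8 and 16 are taken from the error above.
result = describe_batch_mismatch((8, 300, 80), (16, 2), num_devices=2)
print(result)
```

If this reports `device_factor` on every evaluation step, that would support the gathering theory; `other` would suggest looking elsewhere, for example at how the last, smaller batch is handled.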
