Example DeTr Object Detectors not predicting after fine tuning

chuston-ai · June 1, 2023, 10:25pm

I set up a DetrForObjectDetection fine-tuning run with a custom dataset. The model loss bottomed out around 2.0, and the model predicts zero boxes even when applied to training images.

To debug what I’d done, I stepped through the Object Detection demonstration in the documentation here:
Object detection
Running the demonstration notebook locally, the model trained on cppe5 data converges to a loss of ~1.8 and also predicts zero boxes.

Does the above demonstration work for anyone? If so, any guesses about what might be amiss? I spun up a fresh EC2 instance from scratch and tried it there, with the same result. I also tried it with an Amazon DLAMI (ami-098c378a13f6a51bc). Both ran on g4dn.xlarge, and neither produced any predictions.

Thanks,

Chris

Some environment info:

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_Mar__8_18:18:20_PST_2022
Cuda compilation tools, release 11.6, V11.6.124
Build cuda_11.6.r11.6/compiler.31057947_0

packages:

[(x.__name__,getattr(x, '__version__')) for x in [transformers, accelerate, evaluate, datasets, torch, torchvision, timm]]
[('transformers', '4.29.2'),
 ('accelerate', '0.19.0'),
 ('evaluate', '0.4.0'),
 ('datasets', '2.12.0'),
 ('torch', '1.13.1+cu116'),
 ('torchvision', '0.14.1+cu116'),
 ('timm', '0.9.2')]

huggingEars · October 5, 2023, 6:50pm

I went straight to the object detection example. After resolving some compile issues, I was able to execute all of the instructions in Google Colab without errors, though with a few warnings. The fine-tuned model is saved to my Hugging Face account. When the model is used as in the example, both through the pipeline and manually, it does not recognize anything in the image (a medical worker with a mask and coveralls). I also went to the model page under my Hugging Face account and tried it from the web page. It did not recognize the image: “No object was detected”. I tried this with the example picture as well as a picture from the initial model (a horse).

Not sure how to debug this without any errors or obvious warnings. Any thoughts or suggestions would be helpful.

huggingEars · October 5, 2023, 6:52pm

Here is the run in Google Colab:

huggingEars · October 5, 2023, 8:35pm

@chuston-ai did you ever figure out the problem?

@devonho and @MariaK , I saw your names in the write-up and Colab for the Object Detector example that trained the DeTr model with the CPPE-5 dataset. The example appears to work but my model does not provide any output for the PPE images or any other image. I included my Colab and run-time results above. Any help would be appreciated.

Thanks,
Eric

markjvickers · December 30, 2023, 10:48pm

I ran into the same issue.
I trained on the free T4 GPU on Google Colab and wound up with no results.
Looking at the model card for devonho here, the thing that jumped out at me was that it was trained for 100 epochs, whereas the example showed only 10…
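
If under-training is the cause, one way it could show up is in the confidence scores: the boxes may exist but fall below the threshold used when filtering predictions. Here is a minimal pure-Python sketch of that filtering step, assuming the DETR convention that the last logit of each object query is the "no object" class. The names `keep_detections`, `best_score`, and `toy_logits` are made up for illustration, and the numbers are toy values, not output from a real model.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def keep_detections(query_logits, threshold):
    """Return (query_index, label, score) for queries scoring above threshold.

    Assumption: each inner list holds the class logits for one object query,
    and the LAST entry is the "no object" class.
    """
    kept = []
    for q, logits in enumerate(query_logits):
        probs = softmax(logits)[:-1]  # drop the "no object" probability
        score = max(probs)
        label = probs.index(score)
        if score > threshold:
            kept.append((q, label, score))
    return kept


def best_score(query_logits):
    """Highest object-class probability across all queries."""
    return max(max(softmax(logits)[:-1]) for logits in query_logits)


# Toy logits for two queries and two real classes (hypothetical values).
# In both queries the "no object" logit dominates, as one might expect
# from a model that has not trained long enough.
toy_logits = [[0.2, 0.1, 2.0], [1.0, 0.3, 1.5]]

print(keep_detections(toy_logits, threshold=0.5))  # nothing survives
print(keep_detections(toy_logits, threshold=0.3))  # one low-confidence box
print(round(best_score(toy_logits), 3))
```

Running the same kind of check on the real model outputs (the maximum object-class probability per image) would tell you whether the model predicts nothing at all or just nothing above the threshold. If the best scores sit slightly below the cutoff, lowering the threshold or training for more epochs is a reasonable next thing to try.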

nielsr · March 30, 2024, 2:34pm

Hi,

Thanks for reporting these issues. I opened a GitHub issue to resolve this: Improve/simplify object detection task guide · Issue #29964 · huggingface/transformers · GitHub

nielsr · May 9, 2024, 10:49am

Update: we now have much better guides as well as official example scripts for object detection. These also include mAP calculation during training.

Find the guide here: Object detection

Find the scripts (both with Trainer API and Accelerate) here: transformers/examples/pytorch/object-detection at main · huggingface/transformers · GitHub

adityabagrii · July 30, 2025, 2:43am

@nielsr
Hi, the updated code has a few errors when used with custom datasets as described in the README. The collate function receives unprocessed batches, which causes the code to throw an error that data[“pixel_values”] is not available. For datasets that are not from the Hugging Face Hub, the code needs to process the batch first and then pass it into the collate function.

I ran into this while working and am trying to resolve it. It would be better if this issue were raised and fixed: either the README instructions should be updated to explain how to work with local datasets, or the code should be fixed to process the batch of the local dataset before passing it to the collate function.
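
To make the proposed fix concrete, here is a rough sketch of the idea in plain Python: wrap the local dataset so every sample is preprocessed on access, and have the collate function fail with a clear message when a sample arrives unprocessed. Everything here is hypothetical (`PreprocessedDataset`, `toy_preprocess`, and the sample layout are illustration only, not code from the example scripts). In a real setup the wrapper would typically subclass `torch.utils.data.Dataset`, the preprocessing would call the image processor, and the collate function would stack tensors.

```python
class PreprocessedDataset:
    """Wraps a local dataset so each sample is processed before batching."""

    def __init__(self, samples, preprocess):
        self.samples = samples
        self.preprocess = preprocess

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        # Processing happens here, so collate_fn only sees processed items.
        return self.preprocess(self.samples[idx])


def toy_preprocess(sample):
    """Hypothetical stand-in for the image processor step."""
    pixel_values = [[v / 255.0 for v in row] for row in sample["image"]]
    labels = {"class_labels": sample["category"], "boxes": sample["bbox"]}
    return {"pixel_values": pixel_values, "labels": labels}


def collate_fn(batch):
    """Group processed samples and complain loudly about unprocessed ones."""
    missing = [i for i, item in enumerate(batch) if "pixel_values" not in item]
    if missing:
        raise KeyError(
            f"samples {missing} have no 'pixel_values'; "
            "was the dataset preprocessed before batching?"
        )
    return {
        "pixel_values": [item["pixel_values"] for item in batch],
        "labels": [item["labels"] for item in batch],
    }


# Two made-up raw samples standing in for a local dataset.
raw_samples = [
    {"image": [[0, 255]], "category": [1], "bbox": [[0.5, 0.5, 0.2, 0.2]]},
    {"image": [[51, 102]], "category": [0], "bbox": [[0.3, 0.4, 0.1, 0.1]]},
]

dataset = PreprocessedDataset(raw_samples, toy_preprocess)
batch = collate_fn([dataset[i] for i in range(len(dataset))])
print(sorted(batch.keys()))
```

Passing `raw_samples` straight into `collate_fn` raises the `KeyError`, which is the situation described above: the batch reaches the collate function before anything has produced `pixel_values`.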
