Example DETR object detector not predicting after fine-tuning

I set up a DetrForObjectDetection fine-tuning run with a custom dataset. The loss bottomed out around 2.0, and the model predicts zero boxes even when applied to training images.

To debug what I’d done, I stepped through the Object Detection demonstration in the documentation here:
Object detection
When I run the demonstration notebook locally, the model trained on the CPPE-5 data converges to a loss of ~1.8 and also predicts zero boxes.

Does the above demonstration work for anyone? If so, any guesses about what might be amiss? I spun up a fresh EC2 instance and tried it there, with the same result. I also tried it with an Amazon DLAMI (ami-098c378a13f6a51bc), both on g4dn.xlarge, and again got no predictions.


  • Chris

Some environment info:

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_Mar__8_18:18:20_PST_2022
Cuda compilation tools, release 11.6, V11.6.124
Build cuda_11.6.r11.6/compiler.31057947_0


[(x.__name__,getattr(x, '__version__')) for x in [transformers, accelerate, evaluate, datasets, torch, torchvision, timm]]
[('transformers', '4.29.2'),
 ('accelerate', '0.19.0'),
 ('evaluate', '0.4.0'),
 ('datasets', '2.12.0'),
 ('torch', '1.13.1+cu116'),
 ('torchvision', '0.14.1+cu116'),
 ('timm', '0.9.2')]

I went straight to the object detection example. After resolving some compile issues, I was able to execute all of the instructions in Google Colab with no errors and only a few warnings. The fine-tuned model is saved to my HuggingFace account. When the model is used as in the example, both through the pipeline and manually, it does not recognize anything in the image (a medical worker with a mask and coveralls). I also went to the model page under my HuggingFace account and tried it from the web page. It found nothing in the image: “No object was detected”. I tried this with the example picture as well as a picture from the initial model (a horse).

I'm not sure how to debug this without any errors or obvious warnings. Any thoughts or suggestions would be helpful.

Here is the run in Google Colab:

@chuston-ai did you ever figure out the problem?

@devonho and @MariaK, I saw your names in the write-up and Colab for the Object Detector example that trained the DETR model with the CPPE-5 dataset. The example appears to work, but my model does not produce any output for the PPE images or any other image. I included my Colab and run-time results above. Any help would be appreciated.


I ran into the same issue.
I trained on the free T4 GPU on Google Colab and wound up with no results.
Looking at the model card for devonho here, the thing that jumped out at me was that it was trained for 100 epochs, whereas the example showed only 10…
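
One plausible way a short training run could end in "zero boxes" without any error: as far as I understand DETR-style post-processing, each query's class logits go through a softmax, the last "no object" class is dropped, and a query is kept only if its best remaining score clears a threshold. An undertrained model may still put most of its probability on "no object", so every query is filtered out. Below is a minimal pure-Python sketch of that filtering step. The logits are made up for illustration, `keep_detections` is a hypothetical helper (not a library function), and the 0.5 default threshold is an assumption.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def keep_detections(query_logits, threshold=0.5):
    """Hypothetical helper mimicking DETR-style score filtering.

    Each row holds one query's class logits; the LAST entry is assumed
    to be the "no object" class. Returns (query_index, label, score)
    for every query whose best object-class score exceeds the threshold.
    """
    kept = []
    for i, logits in enumerate(query_logits):
        probs = softmax(logits)
        object_probs = probs[:-1]          # drop the "no object" class
        score = max(object_probs)
        label = object_probs.index(score)
        if score > threshold:
            kept.append((i, label, score))
    return kept


# Made-up logits for 3 queries: 2 object classes + "no object".
# "No object" dominates every row, as it might early in training.
undertrained = [
    [0.2, 0.1, 1.5],
    [0.4, 0.0, 1.2],
    [-0.3, 0.6, 1.0],
]

print(len(keep_detections(undertrained, threshold=0.5)))  # nothing survives
print(len(keep_detections(undertrained, threshold=0.1)))  # low-confidence boxes appear
```

If lowering the threshold on a real model makes boxes appear, the model is probably producing low-confidence predictions rather than none at all, which would be consistent with needing more epochs. If nothing appears even at a very low threshold, the cause likely lies elsewhere.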


Thanks for reporting these issues. I opened a GitHub issue to resolve this: Improve/simplify object detection task guide · Issue #29964 · huggingface/transformers · GitHub

Update: we now have much better guides as well as official example scripts for object detection. These also compute mAP during training.

Find the guide here: Object detection

Find the scripts (both with Trainer API and Accelerate) here: transformers/examples/pytorch/object-detection at main · huggingface/transformers · GitHub