Example DETR object detector not predicting after fine-tuning

I set up a DetrForObjectDetection fine-tuning run with a custom dataset. The training loss bottomed out around 2.0, and the model predicts zero boxes even when applied to training images.

To debug what I’d done, I stepped through the Object Detection demonstration in the documentation here:
Object detection
Running the demonstration notebook locally, the model trained on the CPPE-5 data converges to a loss of ~1.8 and also predicts zero boxes.

Does the above demonstration work for anyone? If so, any guesses about what might be amiss? I spun up a fresh EC2 instance and tried it there, with the same result. I also tried it with an Amazon DLAMI (ami-098c378a13f6a51bc). Both were on g4dn.xlarge, and both gave no predictions.


  • Chris

Some environment info:

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_Mar__8_18:18:20_PST_2022
Cuda compilation tools, release 11.6, V11.6.124
Build cuda_11.6.r11.6/compiler.31057947_0


import transformers, accelerate, evaluate, datasets, torch, torchvision, timm
[(x.__name__, x.__version__) for x in [transformers, accelerate, evaluate, datasets, torch, torchvision, timm]]
[('transformers', '4.29.2'),
 ('accelerate', '0.19.0'),
 ('evaluate', '0.4.0'),
 ('datasets', '2.12.0'),
 ('torch', '1.13.1+cu116'),
 ('torchvision', '0.14.1+cu116'),
 ('timm', '0.9.2')]

I went straight to the object detection example. After resolving some compile issues, I was able to execute all of the instructions in Google Colab with no errors and only a few warnings. The fine-tuned model is saved to my Hugging Face account. When the model is used in the example, both through the pipeline and manually, it does not recognize anything in the image (a medical worker with a mask and coveralls). I also went to the model page under my Hugging Face account and tried it from the web page. It did not detect anything in the image: “No object was detected”. I tried this with the example picture as well as a picture from the initial model (a horse).

I'm not sure how to debug this without any errors or obvious warnings. Any thoughts or suggestions would be helpful.
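One thing that might help narrow this down is checking whether the model really outputs nothing, or whether its scores are just below the detection threshold. As far as I understand it, DETR post-processing takes a softmax over each query's logits, drops the trailing "no object" class, and keeps only queries whose best remaining score exceeds `threshold`. Below is a minimal pure-Python sketch of that filtering step, not the library's actual code. The helper names (`softmax`, `best_scores`, `count_detections`) and the example logits are made up for illustration; with a real model you would presumably feed in something like `outputs.logits[0].tolist()`.

```python
import math


def softmax(row):
    """Numerically stable softmax over one query's class logits."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def best_scores(logits):
    """Best real-class probability per query, highest first.

    Assumes the last column is the "no object" class, which is ignored.
    """
    return sorted((max(softmax(row)[:-1]) for row in logits), reverse=True)


def count_detections(logits, threshold):
    """Number of queries that would survive a given score threshold."""
    return sum(1 for score in best_scores(logits) if score > threshold)


# Hypothetical logits: 3 queries, 2 real classes plus the "no object" class.
example_logits = [
    [2.0, 0.0, 1.0],
    [0.0, 0.5, 3.0],
    [0.2, 0.1, 4.0],
]

for t in (0.9, 0.5, 0.05, 0.01):
    print(f"threshold={t}: {count_detections(example_logits, t)} boxes kept")
```

If the top values from `best_scores` on a training image are all well below the threshold in use, an undertrained model (or a threshold that is too strict for it) seems more likely than a broken inference path. Lowering the `threshold` argument passed to `post_process_object_detection` would be a quick way to check.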

Here is the run in Google Colab:

@Chuston1776 did you ever figure out the problem?

@devonho and @MariaK, I saw your names in the write-up and Colab for the object detection example that trains the DETR model on the CPPE-5 dataset. The example appears to run without problems, but my fine-tuned model does not produce any detections for the PPE images or for any other image. I included my Colab and run-time results above. Any help would be appreciated.