I am a postgraduate student who wants to write a paper on improving RT-DETR's performance. I am using RT-DETR with a road-sign dataset to train a traffic-detection model and get a baseline. The problem is that I have trained for 600+ epochs, but there is no sign of convergence. The results are shown in the picture.
My questions are:
- What happened? Is it overfitting (although val_loss seems to be okay), is my dataset too simple, or is there some other reason?
- What can I do? Should I use regularization, change the dataset, or …?
More information (if more info is needed, please tell me):
model: RT-DETR
dataset: 800+ pics, train/val/test: 8/1/1
epochs: 600+, optimizer: AdamW, bs=156, gpu='0,1,2', workers=24, cache=true, backbone=resnet18
training on 3 GPUs (A6000)
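
In case it helps with diagnosing this, here is a rough calculation of how many optimizer steps the settings above give. It is only a sketch under two assumptions that are not exact: that the dataset has exactly 800 images (the real count is a bit higher) and that bs=156 is the total batch across all three GPUs rather than per GPU. The function and variable names are just for illustration.

```python
import math

def steps_per_epoch(num_images: int, train_parts: int, total_parts: int, batch_size: int) -> int:
    """Number of optimizer steps in one epoch, counting a final partial batch."""
    # Integer arithmetic avoids floating-point rounding in the split size.
    train_images = num_images * train_parts // total_parts
    return math.ceil(train_images / batch_size)

NUM_IMAGES = 800   # assumption: the post only says "800+"
TRAIN_PARTS = 8    # train share of the 8/1/1 split
TOTAL_PARTS = 10   # 8 + 1 + 1
BATCH_SIZE = 156   # assumption: total batch size, not per GPU
EPOCHS = 600       # the post says "600+"

per_epoch = steps_per_epoch(NUM_IMAGES, TRAIN_PARTS, TOTAL_PARTS, BATCH_SIZE)
total_steps = per_epoch * EPOCHS

print(f"steps per epoch: {per_epoch}")
print(f"total optimizer steps over {EPOCHS} epochs: {total_steps}")
```

Under these assumptions there are only 5 steps per epoch, so 600 epochs amount to about 3000 weight updates in total. I am not sure whether that is relevant to the behaviour in the picture.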
