How to increase YOLOS precision?

I created a custom dataset (400 images, 4 categories, split 80% train / 20% validation) for detecting layout elements in scanned pages (images, tables, headers, etc.). I fine-tuned the hustvl/yolos-base model on it, following the tutorial from Niels Rogge:

The fine-tuning works and the model detects the elements I want on test images, but the predicted bounding boxes are often not as precise as I need, and the confidence scores for some correctly detected objects are very low.

Which of the following options (if any) is most likely to improve the model's precision?

  • more training data
  • longer training
  • changing hyperparameters
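
To make "not as precise as I want" measurable when comparing these options, the overlap between a predicted box and its ground-truth box can be scored with intersection over union (IoU). This is only a minimal sketch: it assumes boxes in (x_min, y_min, x_max, y_max) pixel coordinates, and the function name and sample boxes are illustrative, not taken from the tutorial or my training code.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Corners of the overlapping rectangle (empty if the boxes do not intersect).
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) boxes.
    return inter / union if union > 0 else 0.0


# Hypothetical example: a prediction shifted 5 px to the right of a 10x10 ground-truth box.
ground_truth = (0, 0, 10, 10)
prediction = (5, 0, 15, 10)
iou = box_iou(ground_truth, prediction)
```

Averaging this score over the validation set before and after a change (more data, longer training, different hyperparameters) would show whether localization actually improved, independently of the confidence scores.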

Thanks in advance for any help and suggestions. I have attached the loss plot for reference.