I have been struggling with this problem for a long time and have probably run more than 100 experiments.
I am working on an object detection model for medical imaging using the SSD architecture with various ResNet backbones. I started with a resnet34 backbone and experimented with many hyperparameters. The training loss curve looks fine in almost all cases, but the validation loss starts to increase even with weight_decay = 0.1, which is a really high value. voc_map on train_set comes out above 90%, but on val_set the maximum I have achieved with resnet34 is 35%. The model is clearly overfitting heavily. I have used several augmentations that the medical experts confirmed are acceptable.
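For reference, the optimizer setup looks roughly like the sketch below. It assumes AMSGrad is enabled through `torch.optim.Adam` with `amsgrad=True`; the `model` stand-in and the learning rate are placeholders for illustration, not the actual values from my experiments.

```python
import torch
from torch import nn

# Hypothetical stand-in for the SSD detector, only here so the
# optimizer has some parameters to manage.
model = nn.Conv2d(3, 8, kernel_size=3)

# Adam with the AMSGrad variant enabled. In torch.optim.Adam, weight_decay
# is an L2 penalty added to the gradient; torch.optim.AdamW applies
# decoupled weight decay instead.
optimizer = torch.optim.Adam(
    model.parameters(),
    lr=1e-3,           # assumed value, not from the experiments
    weight_decay=0.1,  # the value mentioned above
    amsgrad=True,
)
```

Because the decay is coupled to the gradient in `Adam`, the same `weight_decay` number may regularize differently than it would under `AdamW`, so the two are not directly comparable.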
Another odd thing I observed in the loss curves is that the validation classification loss generally starts to increase early, and after more epochs of training the localization loss also starts to increase.
Some other details:
Optimizer: AMSGrad
Classification loss: Cross-Entropy with Hard Negative Mining
Localization loss: Smooth L1 loss
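To make the loss setup concrete, here is a minimal sketch of cross-entropy with hard negative mining plus Smooth L1 on the positive anchors. The function name `ssd_loss`, the tensor layout, the background class index 0, and the 3:1 negative-to-positive ratio are assumptions for illustration and may differ from my actual implementation.

```python
import torch
import torch.nn.functional as F


def ssd_loss(cls_logits, box_preds, cls_targets, box_targets, neg_pos_ratio=3):
    """Sketch of an SSD-style loss (hypothetical helper).

    cls_logits:  (B, A, C) raw class scores per anchor
    box_preds:   (B, A, 4) predicted box offsets
    cls_targets: (B, A)    long tensor, 0 = background (assumed)
    box_targets: (B, A, 4) encoded ground-truth offsets
    """
    pos_mask = cls_targets > 0                   # anchors matched to an object
    num_pos = pos_mask.sum(dim=1, keepdim=True)  # (B, 1)

    # Per-anchor cross-entropy; cross_entropy expects classes on dim 1.
    ce = F.cross_entropy(cls_logits.transpose(1, 2), cls_targets, reduction="none")

    # Hard negative mining: rank background anchors by loss and keep only
    # the neg_pos_ratio * num_pos hardest ones per image.
    neg_ce = ce.detach().masked_fill(pos_mask, float("-inf"))
    rank = neg_ce.argsort(dim=1, descending=True).argsort(dim=1)
    num_neg = (neg_pos_ratio * num_pos).clamp(max=cls_targets.size(1) - 1)
    neg_mask = (rank < num_neg) & ~pos_mask

    cls_loss = ce[pos_mask | neg_mask].sum()

    # Localization loss is computed on positive anchors only.
    loc_loss = F.smooth_l1_loss(
        box_preds[pos_mask], box_targets[pos_mask], reduction="sum"
    )

    # Normalize both terms by the number of positives (at least 1).
    n = num_pos.sum().clamp(min=1).float()
    return cls_loss / n, loc_loss / n
```

With this kind of mining, the classification term sees only the hardest negatives, so it is tracked separately from the localization term in the curves described above.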
I am sorry if this is not the correct forum, but since I am using PyTorch, I thought someone else in the community might have faced similar issues.