If you were to overfit training of GoogLeNet using ImageNet data, what would be your strategy?
I am trying to do this and all I can get are fairly well-generalized results.
My data is the original ImageNet 2012 1000-class set, which I preprocessed by taking 3 maxspect subcrops of each image and scaling them to 224x224. This is data I have used in the past with Caffe.
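For reference, the subcropping is roughly along these lines (a minimal sketch only: largest square, aspect-preserving crops taken at three positions along the longer side, each resized to 224x224; the helper name is just for illustration):

```python
from PIL import Image

def maxspect_subcrops(path, out_size=224, n_crops=3):
    """Sketch: take n_crops largest square crops spaced along the longer
    side of the image, then scale each to out_size x out_size."""
    img = Image.open(path).convert("RGB")
    w, h = img.size
    side = min(w, h)                      # largest square that fits
    crops = []
    for i in range(n_crops):
        if w >= h:                        # slide the window along the width
            left = round(i * (w - side) / max(n_crops - 1, 1))
            box = (left, 0, left + side, side)
        else:                             # or along the height
            top = round(i * (h - side) / max(n_crops - 1, 1))
            box = (0, top, side, top + side)
        crops.append(img.crop(box).resize((out_size, out_size)))
    return crops
```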
I took the script https://github.com/pytorch/examples/blob/main/imagenet/main.py
and turned off augmentation in the transforms, because my files are already preprocessed. I am manipulating the learning rate with either StepLR or ReduceLROnPlateau.
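The remaining training transform ends up being just tensor conversion plus normalization, roughly like this (a minimal sketch; the normalization constants are the standard ImageNet ones from the example script):

```python
import torchvision.transforms as transforms

# Standard ImageNet normalization from the example script
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

# No RandomResizedCrop / RandomHorizontalFlip: the 224x224 subcrops are
# already prepared offline, so the transform just converts and normalizes
train_transform = transforms.Compose([
    transforms.ToTensor(),
    normalize,
])
```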
Strategies I have tried:
StepLR - train for up to 100 epochs with combinations of step_size and gamma such that the LR is ~10^-6 by the 100th epoch. Gamma values like 0.1, 0.5, 0.67, 0.9, 0.925, with step_size derived from that (a rough setup sketch follows this list).
ReduceLROnPlateau - I am using this with default parameters and mode ‘min’. I’m only on my first attempt with this, TBH.
With StepLR I have used 500, 1000, and 5000 images/class. With ReduceLROnPlateau I’m using 1000 images/class.
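Roughly, the two scheduler setups look like this (a minimal sketch, assuming the example script's default SGD settings of initial LR 0.1, momentum 0.9, weight decay 1e-4; the train/validate loop is elided):

```python
import torch
import torchvision.models as models

# GoogLeNet from torchvision, trained from scratch (aux classifiers off for brevity)
model = models.googlenet(num_classes=1000, aux_logits=False, init_weights=True)

# SGD settings assumed to match the example script's defaults
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=1e-4)

# Option A: StepLR -- e.g. gamma=0.1 every 20 epochs gives 0.1 * 0.1**5 = 1e-6 by epoch 100
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.1)

# Option B: ReduceLROnPlateau with default parameters and mode='min', stepped on val loss
# scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min')

for epoch in range(100):
    # train(...) and val_loss = validate(...) omitted -- same as the example script
    scheduler.step()              # StepLR: step once per epoch
    # scheduler.step(val_loss)    # ReduceLROnPlateau variant
```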
Either way I get extremely well-behaved training. StepLR I can make overfit just a bit: val accuracy falls slightly below its best result. ReduceLROnPlateau is even more well behaved: val accuracy plateaus and just stays there.
It is as if PyTorch is so well made that it really does not want to overfit a model! How do I defeat it?