I’m working on an artist classification task (50 classes) with a dataset of 5,311 images. The biggest constraint here is that I strictly cannot use pre-trained weights and must train from scratch.
I’m currently using ResNet18 and facing massive overfitting. My training accuracy hits 99% pretty quickly, but validation accuracy is stuck way lower. I’ve tried adding dropout, using weight decay, extensive learning rate tuning, and even reducing the number of channels in the ResNet architecture to lower the model complexity.
Nothing seems to be closing the train-val gap effectively. Is ResNet18 still too complex for this amount of data without transfer learning?
I’ve never used ResNet18, so this is just uninformed speculation on my part.
Is the constraint that you cannot use pre-trained weights if they were pre-trained by somebody
else on somebody else’s dataset? Or is the constraint that you can’t use pre-trained weights
even if you pre-trained them using your own dataset?
You are trying to train with about 5,000 images, split into about 100 images per class for 50
classes.
Looking at the ResNet paper cited by PyTorch's ResNet18 documentation, I'm guessing that
PyTorch's pre-trained weights come from training a 1000-class classifier on 1.28 million images,
so roughly 1,300 images per class.
So it seems plausible that the training set you want to use to train from scratch could well
be too small. ResNet18 has about 11.7 million parameters. I don’t know how to translate the
number of parameters into how much training data you will need, but it seems plausible that
5,000 training images just isn’t enough.
It sounds like you’ve already tried most of the main techniques for reducing overfitting, but
you should definitely try Chris’s suggestion of using data augmentation. However, it’s not
realistic to expect data augmentation to give you one million effective images from a dataset
of just 5,000 images.
Coming back to the pre-training question: Do you have access to a significantly larger dataset
that wouldn’t violate your “no pre-trained weights” constraint? Would it be possible for you to
do your own pre-training on a dataset that’s large enough to avoid overfitting and then fine-tune
on your actual artist-classification use case?