I am working on an image classification problem on a custom dataset consisting of 1888 training images and 800 validation images (8 classes). I have tried applying transfer learning using various models from the torchvision.models library. For each model, I use the pre-trained weights and train only the final Linear layer, which performs the classification. So far I have the following validation accuracies (batch size 32, SGD with momentum as the optimizer):
1. AlexNet - 93
2. VGG16 - 93
3. VGG16_bn - 57
4. ResNet50 - 26
5. VGG19 - 91
6. VGG19_bn - 59
I have repeated the experiments with both model.train() and model.eval(), but the results do not change much. So, from the results, I am guessing that the models with BatchNorm layers perform very poorly compared to the models that don't have them. Any ideas why this might be happening? Any help would be appreciated. Thanks!
I have looked at this post on the forums, which suggests a possible solution: increasing the momentum value in the BatchNorm constructor. How exactly am I supposed to make this change? Do I have to manually edit the batchnorm.py source, or is there a better way to do it?