Overfitting in resnet-34 vs vgg-16

Hi All,

I trained a dataset (grey-scale ultrasound images with 15 classes) on VGG-16 (the batch-norm version) and ResNet-34.
VGG-16 gives me a validation accuracy of 92%, whereas I can only hit 83% with ResNet-34.

I handled overfitting in both architectures with dropout in the FC layers and regularization (weight decay) in the optimizer.
I don't know why there is minimal overfitting with VGG but not with ResNet. The ResNet model's train loss is 0.02 vs. a validation loss of 0.67, and the model doesn't seem to improve beyond this loss range. I have tried hyper-parameter tuning on weight decay, learning rate, momentum, and dropout.

The purpose of this exercise is to improve our predictions on this dataset, which is why I started trying ResNet-34 after VGG-16.

I would love some suggestions on

  1. How to improve resnet-34 performance and address its overfitting.
  2. Any other architecture suggestions that would work for this type of dataset.

@ptrblck @InnovArul would love to hear your thoughts.

I assume you used the same training code and just swapped the VGG model for the ResNet?
Are you using both pretrained models or are you training from scratch?

Also, did you make sure to change the fc attribute in resnet34 for your output linear layer and the classifier attribute in your vgg16_bn model?

Thanks a bunch for the reply. I started with the pre-trained models, tuning the classifier/FC layer first and then sequentially unfreezing resnet-34 and vgg16_bn for training. As I unfroze the resnet-34 layers, the model seemed to overfit on the training data. Not sure if this is a symptom of something I am not accounting for. The proportion of images per class is consistent across train and validation.

Did the validation accuracy get worse after unfreezing the ResNet?
If you are dealing with overfitting, you could try to increase the regularization or add some data augmentation.

When I started unfreezing the layers of resnet, the accuracy went from 50% to 83% using the SGD optimizer, but it saturates at this stage. I am using the data augmentation that was suggested for this type of image. Yes, I should spend more quality time on regularization. I have not experimented with different optimizers yet; not sure if they would make a difference.


I am facing a similar problem. I wanted to compare VGG16, ResNet34, and MobileNetV2 on the same dataset. I obtain 90% accuracy with VGG with the feature extractor frozen, but ResNet34 and MobileNetV2 don't converge to similar levels (max accuracy was 80%) with the feature extractor frozen.
I use the same learning rates (starting at 1e-4) with annealing, regularizers (L2 with 1e-4 and 1e-5), and Adam.
Should I try SGD or other parameters?