Facing difficulty while using ImageNet Dataset

I selected a few classes from the ImageNet dataset and fine-tuned the vgg16_bn model, keeping one fully connected layer instead of the original three fully connected layers.

I check the test accuracy after every epoch because I want to save the best model for later use. I use the test accuracy as the criterion for saving the best model, i.e., I assume the model with the highest test accuracy is the one that reached the best minimum of the loss function, with the weights updated accordingly.
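As a minimal, plain-Python sketch of that best-model tracking (the `weights` dicts here are placeholders; in PyTorch you would save `model.state_dict()` with `torch.save` instead):

```python
import copy

def track_best(epoch_results):
    """Keep the checkpoint from the epoch with the highest test accuracy.

    epoch_results: list of (test_accuracy, weights) pairs, where `weights`
    stands in for a model checkpoint (e.g. model.state_dict() in PyTorch).
    """
    best_acc = 0.0
    best_weights = None
    for test_acc, weights in epoch_results:
        if test_acc > best_acc:
            best_acc = test_acc
            # deepcopy so later training updates do not mutate the saved copy
            best_weights = copy.deepcopy(weights)
    return best_acc, best_weights

# Usage: accuracy peaks at epoch 2, then degrades (the overfitting pattern
# described above), so epoch 2's checkpoint is kept.
results = [(0.60, {"epoch": 0}), (0.75, {"epoch": 1}),
           (0.81, {"epoch": 2}), (0.78, {"epoch": 3})]
acc, ckpt = track_best(results)
# acc == 0.81, ckpt == {"epoch": 2}
```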

During the first few epochs the test accuracy is higher than the train accuracy. I understand that the network has not learned properly yet, so the predictions are not reliable.

As training goes on, the training accuracy gradually increases and the train loss decreases; the test accuracy increases as well. After some epochs, however, the training keeps getting better while the test loss starts increasing and the test accuracy decreases. My interpretation is that the network has now learned the training data better, since the training loss is lower, and is therefore trying to make more accurate predictions on the test dataset.

Is my understanding correct?
I am new to deep learning and have little experience, so it would be helpful if someone could guide me.

Also, one thing I do not understand: if training gets better, the loss is reduced and the network parameters (weights) are updated accordingly, so why does the network give a lower test accuracy on the test dataset?

I can share the code and output logs if required.
Any help will be appreciated.

This is called overfitting: your model starts to learn training-set-specific features, which do not generalize well to unseen examples.
These features depend on the dataset, and you can generally counter this effect by using regularization or increasing data augmentation.
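Another common counter-measure is early stopping: stop training once the test/validation loss has not improved for a few epochs, which is exactly the point where the curves described above start diverging. A minimal sketch (the `patience` value and the loss sequence below are illustrative, not taken from this thread):

```python
def early_stop_epoch(val_losses, patience=3):
    """Return the index of the epoch at which training should stop, i.e.
    the first epoch after which the validation loss has failed to improve
    for `patience` consecutive epochs; return None if that never happens.
    """
    best_loss = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None

# Usage: loss improves until epoch 2, then rises for 3 straight epochs,
# so training would stop at epoch 5.
losses = [0.9, 0.7, 0.6, 0.65, 0.7, 0.8, 0.9]
stop = early_stop_epoch(losses)
# stop == 5
```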

A simple example would be a face classifier:
While the model should learn that a face contains a nose, eyes, a mouth, etc., it might overfit to specific features like blue eyes or black hair to detect a face.


@ptrblck Thank you for replying. I understand it now.

Actually, I used regularization (weight_decay in SGD) and also data augmentation (horizontal flip), but it is still overfitting. A few hours ago I tried adding dropout layers to see if that would solve the overfitting, but it did not solve the problem either. Now I do not know what the next step should be.