Learning rate vs. Validation accuracy

I’m working on an image classification model and evaluating it with 4-fold cross-validation. I get very high accuracies when I use SGD with lr=0.01. However, the plot of validation accuracy vs. epoch doesn’t look good at all; see Fig. 1.


Fig. 1: y-axis and x-axis represent validation acc. and the number of epochs, respectively.

Because it’s really hard to see the trend in the validation accuracy, I tried lowering the learning rate (from lr=0.01 to lr=0.0005) to see if the graph would look better. The result is shown in Fig. 2. I also changed the number of epochs from 100 to 200 in this experiment.


Fig. 2: the number of epochs is changed from 100 to 200

Here, the graph looks better (not too much oscillation now) but the overall performance is lower.

I’m worried about the shape of the validation accuracy curve because I want to publish my work in the future, and the curve shown in Fig. 1 could convince reviewers that my training procedure is not right.

So, I would like to know if I should be worried about these things or not as I haven’t published any paper before. Any help or clarification would be appreciated.

Thanks,
Surayuth

Based on the plots, your validation accuracy seems to be very noisy and also seems to depend on the actual fold. How many samples are in each fold (and in the overall dataset)? The number of samples might be quite small.
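One way to look at both effects at once (noise across epochs and differences between folds) is to aggregate the per-fold curves before plotting. The following is only a minimal sketch using the Python standard library. The function names `summarize_folds` and `moving_average` are hypothetical, and the accuracy values are made up for illustration; they are not taken from the plots above.

```python
from statistics import mean, stdev

def summarize_folds(acc_per_fold):
    """acc_per_fold: one list per fold, each holding per-epoch validation accuracies.

    Returns the per-epoch mean and sample standard deviation across folds.
    """
    per_epoch = list(zip(*acc_per_fold))  # regroup values by epoch
    means = [mean(epoch_values) for epoch_values in per_epoch]
    stds = [stdev(epoch_values) for epoch_values in per_epoch]
    return means, stds

def moving_average(values, window=5):
    """Trailing moving average; early epochs use a shorter window."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

# Made-up numbers for illustration only (4 folds, 3 epochs).
folds = [
    [0.70, 0.80, 0.90],
    [0.60, 0.90, 0.80],
    [0.80, 0.70, 0.90],
    [0.70, 0.80, 0.80],
]
means, stds = summarize_folds(folds)
smoothed = moving_average(means, window=2)
```

Plotting `means` with a band of plus or minus `stds` shows the fold-to-fold spread, while `smoothed` makes the overall trend easier to read without hiding that the raw curve is noisy.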

Thanks for replying. I have a total of 9,724 images in my dataset. The training set is around 75% of the 9,724 images. If I use an even lower learning rate (0.00001), the validation accuracy is no longer noisy, but the accuracy is not good.
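For a rough sense of scale, the numbers above allow a back-of-the-envelope estimate of how much an accuracy measurement could vary purely because the validation fold is finite. This is a sketch under stated assumptions: the validation fold is taken to be the remaining 25% of the data, samples are treated as independent, and the accuracy of 0.90 is an assumed value, not one reported in this thread. The helper name `accuracy_standard_error` is hypothetical.

```python
import math

def accuracy_standard_error(p, n):
    """Binomial standard error of an accuracy p measured on n independent samples."""
    return math.sqrt(p * (1.0 - p) / n)

total_images = 9724
val_fraction = 0.25  # assumption: the 25% not used for training is the validation fold
n_val = round(total_images * val_fraction)

assumed_accuracy = 0.90  # assumption for illustration, not a reported result
se = accuracy_standard_error(assumed_accuracy, n_val)
print(f"validation samples: {n_val}, standard error: {se:.4f}")
```

Under these assumptions the standard error is well below one percentage point, so swings of several points between epochs are more likely to come from the optimization itself than from the size of the validation fold alone.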

I was just wondering what model you are using. It may cause such issues.

VGG16 but other models produce similar results.

It makes sense as the learning rate is too low. Did you check a higher LR with momentum?

A learning rate higher than 0.01 is unstable for me (I get NaN). All the experiments were run using SGD with momentum=0.9.
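The instability at higher learning rates can be illustrated on a toy problem. The sketch below is not the VGG16 setup from this thread: it minimizes a one-dimensional quadratic with an assumed curvature of 100, using the same form of momentum update as SGD with momentum (velocity = momentum * velocity + gradient, then a step of lr * velocity). The function name `run_momentum_sgd` and all constants except momentum=0.9 are hypothetical.

```python
def run_momentum_sgd(lr, momentum=0.9, curvature=100.0, steps=300):
    """Minimize f(x) = 0.5 * curvature * x**2 and return |x| after `steps` updates."""
    x, v = 1.0, 0.0
    for _ in range(steps):
        grad = curvature * x       # gradient of the quadratic
        v = momentum * v + grad    # velocity accumulates past gradients
        x = x - lr * v             # parameter update
    return abs(x)

# On this toy problem the iteration is stable only while
# lr * curvature < 2 * (1 + momentum), i.e. lr < 0.038 here.
stable = run_momentum_sgd(lr=0.01)    # shrinks toward the minimum at 0
unstable = run_momentum_sgd(lr=0.05)  # grows without bound
print(f"lr=0.01 -> |x| = {stable:.3e}")
print(f"lr=0.05 -> |x| = {unstable:.3e}")
```

In a real network the curvature differs per direction and changes during training, so the threshold is not a single known number, but the same mechanism (steps that overshoot more and more) is one common way for a loss to end up as inf or NaN.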