Learning rate vs. Validation accuracy

Surayuth_Pintawong · July 21, 2022, 4:33pm

I’m working on an image classification model and evaluating it with 4-fold cross-validation. I get very high accuracy when I use SGD with lr=0.01. However, the plot of validation accuracy vs. epoch doesn’t look good at all; see Fig. 1.

Fig. 1: y-axis and x-axis represent validation acc. and the number of epochs, respectively.

Because it’s really hard to see the trend in the validation accuracy, I tried lowering the learning rate (from lr=0.01 to 0.0005) to see if the graph looks better. The result is shown in Fig. 2. I also changed the number of epochs from 100 to 200 in this experiment.

Fig. 2: the number of epochs is changed from 100 to 200

Here, the graph looks better (much less oscillation), but the overall performance is lower.
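One way to picture the trade-off described above (lr=0.01 reaches higher accuracy but oscillates, lr=0.0005 is smooth but ends lower) is a schedule that starts at the higher rate and drops to the lower one partway through training. The sketch below is only an illustration: the function name `step_decay_lr` and the switch point at epoch 60 are assumptions, not something tested in this thread.

```python
def step_decay_lr(epoch, high_lr=0.01, low_lr=0.0005, switch_epoch=60):
    """Return the learning rate for a given epoch.

    high_lr and low_lr are the two values tried in this thread;
    switch_epoch is a hypothetical choice that would need tuning.
    """
    return high_lr if epoch < switch_epoch else low_lr


# Learning rate for each of 100 epochs.
lrs = [step_decay_lr(epoch) for epoch in range(100)]
```

Whether this actually keeps the accuracy of the high-rate run while smoothing the curve would have to be checked on the real data.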

I’m worried about the shape of the validation accuracy curve because I want to publish this work in the future, and the curve in Fig. 1 could convince reviewers that my training procedure is not right.

So, I would like to know if I should be worried about these things or not as I haven’t published any paper before. Any help or clarification would be appreciated.

Thanks,
Surayuth

ptrblck · July 22, 2022, 12:12am

Based on the plots, your validation accuracy seems to be very noisy and also seems to depend on the actual fold. How many samples are in each fold (and in the overall dataset)? It looks like the number of samples might be quite small.
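To separate fold-to-fold variation from epoch-to-epoch noise, it can help to look at the mean and standard deviation across folds at each epoch instead of the individual curves. A minimal sketch, assuming the per-fold validation accuracies have been collected into plain lists; the helper name `summarize_folds` and the numbers are made up for illustration:

```python
from statistics import mean, stdev


def summarize_folds(fold_curves):
    """Per-epoch mean and sample standard deviation across folds.

    fold_curves: list of lists, one list of validation accuracies per fold,
    with one value per epoch.
    """
    n_epochs = min(len(curve) for curve in fold_curves)
    means, stds = [], []
    for epoch in range(n_epochs):
        values = [curve[epoch] for curve in fold_curves]
        means.append(mean(values))
        stds.append(stdev(values) if len(values) > 1 else 0.0)
    return means, stds


# Made-up accuracies for 4 folds and 3 epochs, for illustration only.
curves = [
    [0.60, 0.70, 0.80],
    [0.62, 0.74, 0.78],
    [0.58, 0.66, 0.82],
    [0.60, 0.70, 0.80],
]
means, stds = summarize_folds(curves)
```

Plotting `means` with a band of plus or minus `stds` usually makes the trend easier to read than overlaying the raw per-fold curves.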

Surayuth_Pintawong · July 22, 2022, 3:10am

Thanks for replying. I have a total of 9,724 images in my dataset. The training set is around 75% of the 9,724 images. If I use an even lower learning rate (0.00001), the validation accuracy is no longer noisy, but the accuracy is not good.
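For a rough sense of how much noise the fold size alone could explain, the binomial standard error of an accuracy estimate can be computed from these numbers. This is a back-of-the-envelope sketch: it assumes the 4 folds are equal in size, that validation samples are independent, and it uses a hypothetical accuracy of 0.90 since the actual values are only visible in the figures.

```python
import math


def accuracy_standard_error(p, n):
    """Binomial standard error of an accuracy p measured on n samples."""
    return math.sqrt(p * (1.0 - p) / n)


total_images = 9724
n_folds = 4
val_size = total_images // n_folds      # images held out per fold
train_size = total_images - val_size    # remaining ~75% used for training

assumed_accuracy = 0.90  # hypothetical value, not taken from the plots
se = accuracy_standard_error(assumed_accuracy, val_size)

print(f"validation images per fold: {val_size}")
print(f"training images per fold:   {train_size}")
print(f"standard error at p={assumed_accuracy}: {se:.4f}")
```

Under these assumptions the sampling error is well below one percentage point, so swings of several points between epochs would more likely come from the optimization than from the size of the validation fold.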

mxahan · July 22, 2022, 3:25am

I was just wondering what model you are using. It may cause such issues.

Surayuth_Pintawong · July 22, 2022, 6:08am

VGG16 but other models produce similar results.

mxahan · July 23, 2022, 5:18am

It makes sense as the learning rate is too low. Did you check a higher LR with momentum?

Surayuth_Pintawong · July 23, 2022, 7:28am

A learning rate higher than 0.01 is unstable for me (the loss becomes NaN). All the experiments were done using SGD with momentum=0.9.
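When checking where a run diverges, it can help to record the loss at every step and find the first non-finite value. A minimal sketch in plain Python; the helper name `first_non_finite` and the loss values are made up for illustration, and in a real training loop the recorded values would be the per-step losses converted to floats:

```python
import math


def first_non_finite(losses):
    """Return the index of the first NaN or infinite loss, or None if all are finite."""
    for step, loss in enumerate(losses):
        if not math.isfinite(loss):
            return step
    return None


# Made-up loss history in which the run blows up before reaching NaN.
losses = [2.3, 1.9, 4.7, float("inf"), float("nan")]
bad_step = first_non_finite(losses)
```

Looking at the few steps before `bad_step` shows whether the loss grew steadily or jumped in a single step, which gives a hint about how far the learning rate is from a stable value.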