Validation Loss decreasing slightly

Dear All,

I am training my network, and I can tell that I am not overfitting, since the validation loss is still decreasing along with the training loss.

However, after training for a while, the validation loss only decreases slightly, maybe once every 11-12 epochs. At some point I expect it to become steady, but it still keeps decreasing, just with a long wait (about 12 epochs) between improvements.

I wonder why it neither keeps decreasing steadily, as it did in the first epochs, nor starts to increase, which would mean I had started to overfit. As it is, I cannot decide how many epochs I should train for. Is this a problem with the optimizer's learning rate? I am not using any learning rate decay and just keep it fixed.

The plot of the validation loss looks like this:

This just reflects that the gradients and optimizer steps from training become smaller. The “noise” comes from stochastic steps in parameter space that are suboptimal w.r.t. the validation set.

Optimizer tuning might achieve faster training loss convergence, and would affect the validation loss graph indirectly. OTOH, big steps may lead to solutions that generalize worse.

@googlebot Thanks a lot for your response.

But sorry, I did not quite get it. So, do you recommend decreasing the learning rate after a few epochs, or increasing it?

Increasing the LR by epoch is usually only done as “warmup” for optimizers that collect gradient stats, like Adam. And I’m not sure if an LR decrease is warranted with that graph. So, in this specific case, an overall LR increase can maybe reach the same training loss in fewer epochs, but adding an LR decay scheduler may be advisable (depends on the optimizer too).
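To make the warmup-then-decay idea concrete, here is a minimal sketch of a per-epoch LR multiplier. The function name `lr_factor`, the 5-epoch warmup, the 0.95 decay rate and the base LR are all made-up values for illustration, not a recommendation for this particular model.

```python
def lr_factor(epoch, warmup_epochs=5, decay_rate=0.95):
    """Multiplier applied to the base learning rate at a 0-indexed epoch.

    warmup_epochs and decay_rate are illustrative assumptions, not tuned values.
    """
    if epoch < warmup_epochs:
        # Linear warmup: ramps from 1/warmup_epochs up to 1.0
        return (epoch + 1) / warmup_epochs
    # Exponential decay once warmup is over
    return decay_rate ** (epoch - warmup_epochs)


base_lr = 1e-3  # hypothetical base learning rate
schedule = [base_lr * lr_factor(epoch) for epoch in range(10)]
for epoch, lr in enumerate(schedule):
    print(f"epoch {epoch}: lr = {lr:.6f}")
```

In PyTorch, a function like this can be passed as `lr_lambda` to `torch.optim.lr_scheduler.LambdaLR`, calling `scheduler.step()` once per epoch. Whether decay actually helps here still depends on the optimizer and on how the training loss behaves.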

It is hard to say how to schedule the LR optimally - some people even do hyperparameter optimization runs to find that.

But my point was, your validation graph is fine if its slowdown just reflects a slowdown on the training graph. As for reasoning about overfitting, you kinda have to “smooth” the plot points one way or another, as the volatility is inherent.
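One possible way to do that smoothing is an exponential moving average over the per-epoch validation losses, and then counting epochs since the smoothed curve last improved. This is only a sketch: the helper names, `alpha=0.3`, `min_delta` and the loss values below are made up for illustration.

```python
def ema_smooth(values, alpha=0.3):
    """Exponential moving average; a smaller alpha gives heavier smoothing."""
    smoothed = []
    running = None
    for v in values:
        # The first point seeds the average; later points are blended in
        running = v if running is None else alpha * v + (1 - alpha) * running
        smoothed.append(running)
    return smoothed


def epochs_since_improvement(smoothed, min_delta=0.0):
    """Number of epochs since the smoothed loss last improved by more than min_delta."""
    best = float("inf")
    best_epoch = -1
    for epoch, v in enumerate(smoothed):
        if v < best - min_delta:
            best, best_epoch = v, epoch
    return len(smoothed) - 1 - best_epoch


# Made-up noisy validation losses, not taken from the plot in this thread
val_losses = [0.90, 0.70, 0.62, 0.65, 0.60, 0.61, 0.59, 0.60, 0.58]
smoothed = ema_smooth(val_losses)
print([round(v, 3) for v in smoothed])
print(epochs_since_improvement(smoothed))
```

If the count returned by `epochs_since_improvement` stays above some patience you choose (say, a few multiples of the 11-12 epoch gap seen here), that is a reasonable signal to stop. A count of zero on the smoothed curve means it is still improving, even if the raw points bounce around.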