I am training a model with an unsupervised loss function:
For validation, I am printing the same loss function for every epoch:
Although the general trend of my validation curve seems to go down, I am wondering why the validation loss is so noisy and varies so much from epoch to epoch. Could this indicate some kind of overfitting?
By the way, my results are looking fine (except for “bad” epochs) but I am looking for ways to improve them even further.
I am already using weight decay for regularization.
Thanks in advance!
It seems that your training loss is even noisier if I understand the plot correctly.
Which batch size are you using? Often larger batches yield a smoother loss curve.
Yes, that is correct. I am using a batch size of 4 due to memory limitations.
Are you suggesting that a smoother loss curve will ultimately yield superior results?
I.e., bad gradients <-> noisy loss?
If yes, how can I achieve this with my current setup without increasing the batch size? I am already using ~11 GB of my GTX 1080 Ti…
Not necessarily. Although a larger batch size might yield smoother loss curves, the performance might be worse than using noisy smaller batches.
However, you could try to simulate larger batches by accumulating gradients over several smaller batches, as described here, and compare the results.
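A minimal sketch of the accumulation pattern, assuming a toy model and random data as stand-ins for your actual setup: the loss of each small batch is scaled by the number of accumulation steps, so the summed gradients match the mean over the larger effective batch, and the optimizer only steps once per group.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical tiny setup just to illustrate the pattern;
# swap in your own model, data, and unsupervised loss.
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.MSELoss()

accum_steps = 4  # gradients from 4 micro-batches -> one optimizer step
data = [(torch.randn(4, 10), torch.randn(4, 1)) for _ in range(8)]

optimizer.zero_grad()
for i, (x, y) in enumerate(data):
    loss = criterion(model(x), y)
    # Scale so the accumulated gradient equals the mean over the
    # effective batch of 4 * accum_steps = 16 samples
    (loss / accum_steps).backward()
    if (i + 1) % accum_steps == 0:
        optimizer.step()      # one update per accum_steps micro-batches
        optimizer.zero_grad()
```

Memory stays at the micro-batch level, since only gradients (which you store anyway) are accumulated, not extra activations.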
Alternatively, you could also use torch.utils.checkpoint to trade compute for memory.
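A short sketch of checkpointing, assuming a plain nn.Sequential stack as a stand-in for your architecture: checkpointed segments discard their intermediate activations in the forward pass and recompute them during backward, so you pay extra compute for a smaller memory footprint.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint_sequential

torch.manual_seed(0)

# Hypothetical deep stack; replace with your own model.
model = nn.Sequential(
    *[nn.Sequential(nn.Linear(64, 64), nn.ReLU()) for _ in range(8)]
)
x = torch.randn(4, 64, requires_grad=True)

# Split the sequential model into 2 checkpointed segments: only the
# activations at segment boundaries are kept; the rest are recomputed
# on the backward pass.
out = checkpoint_sequential(model, 2, x, use_reentrant=False)
out.sum().backward()
```

The freed activation memory could then be spent on a larger batch size, which might give you the smoother curves directly.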