I am trying to train a network on the ImageNet dataset and I observed unusual behaviour with SGD. The loss keeps decreasing and then suddenly increases. I am using SGD with momentum and no learning-rate scheduler, with lr=1e-3. I have attached my train and test plots, along with the code.
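For reference, here is a minimal sketch of the update that SGD with momentum performs, written in plain Python on a toy quadratic rather than ImageNet. This is not the attached training code: the momentum value 0.9, the PyTorch-style form of the update, and the names `sgd_momentum_step`, `toy_loss`, and `toy_grad` are assumptions for illustration. Only lr=1e-3 comes from the setup described above.

```python
def sgd_momentum_step(params, grads, velocity, lr=1e-3, momentum=0.9):
    """One SGD-with-momentum update.

    Assumed form (PyTorch-style, no dampening, no weight decay):
        v <- momentum * v + g
        p <- p - lr * v
    """
    new_velocity = [momentum * v + g for v, g in zip(velocity, grads)]
    new_params = [p - lr * v for p, v in zip(params, new_velocity)]
    return new_params, new_velocity


def toy_loss(params):
    # Hypothetical stand-in for the real training loss: sum of squares.
    return sum(p * p for p in params)


def toy_grad(params):
    # Gradient of the sum-of-squares loss above.
    return [2.0 * p for p in params]


params = [1.0, -2.0]
velocity = [0.0, 0.0]
losses = []
for _ in range(200):
    losses.append(toy_loss(params))
    params, velocity = sgd_momentum_step(params, toy_grad(params), velocity)
losses.append(toy_loss(params))
```

On this toy problem the recorded `losses` fall steadily with a fixed learning rate, which is the behaviour I expected from the real run as well.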
I don't understand why the loss suddenly increases.
Thanks in advance for any responses.