Loss suddenly increases using Adam optimizer

Thanks a lot for your detailed reply, Munkiti. I ran the training straight through, without any restarts. The learning rate for Adam is 1e-3, and the network is a typical ResNet structure.

I will check whether the problem comes from a small denominator in Adam's update step. I will post back when I find a solution.
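
For reference, here is a rough sketch of what I understand by the small-denominator issue. It is plain Python, not my training code. The function name `adam_update_magnitude`, the gradient value 1e-7, and the alternative eps of 1e-3 are illustrative assumptions, and it uses the simplification that the gradient is constant, so the bias-corrected moments settle at m_hat = g and v_hat = g^2.

```python
import math

def adam_update_magnitude(grad, lr=1e-3, eps=1e-8):
    """Size of one Adam update for a constant gradient at steady state.

    Hypothetical helper for illustration only. With a constant gradient g,
    the bias-corrected moments converge to m_hat = g and v_hat = g**2,
    so the update is lr * g / (|g| + eps).
    """
    m_hat = grad
    v_hat = grad ** 2
    return lr * m_hat / (math.sqrt(v_hat) + eps)

# A gradient that has become very small late in training (assumed value).
tiny_grad = 1e-7

# With the common default eps, the step stays close to the full learning rate
# even though the gradient itself is tiny.
step_default_eps = adam_update_magnitude(tiny_grad, eps=1e-8)

# With a larger eps, the denominator is dominated by eps and the step shrinks.
step_larger_eps = adam_update_magnitude(tiny_grad, eps=1e-3)

print(step_default_eps, step_larger_eps)
```

If this is what is happening, the parameters would keep moving by roughly the learning rate per step even when the gradients are almost zero, which might explain a sudden jump in the loss. Trying a larger eps or a lower learning rate would be one way to test that.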