Exploding gradient in DenseNet

Abhishek_Gupta · March 29, 2024, 5:01pm

I have recently been training a DenseNet model for image classification (emotion detection), using the Adam optimizer. The problem is that even with the learning rate set as low as 1e-16 and max_norm set to 10 in nn.utils.clip_grad_norm_, the training accuracy still changes abruptly.

Has anyone faced the same issue? Please help.
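
For reference, here is a minimal pure-Python sketch of what clipping by global norm with max_norm=10 does, as far as I understand it: the combined L2 norm of all gradients is computed, and the gradients are rescaled only if that norm exceeds max_norm. The helper name `clip_by_global_norm` and the gradient values are made up for illustration; this is not the PyTorch implementation, only an approximation of its behaviour.

```python
import math


def clip_by_global_norm(grads, max_norm, eps=1e-6):
    """Hypothetical helper: rescale gradients so their combined L2 norm
    is at most max_norm. Returns (clipped_grads, norm_before_clipping)."""
    # One norm over all gradient values together, not one norm per tensor.
    total_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    # The scale is capped at 1.0, so small gradients are left unchanged.
    scale = min(1.0, max_norm / (total_norm + eps))
    clipped = [[g * scale for g in grad] for grad in grads]
    return clipped, total_norm


# Made-up gradients for two parameter groups; their combined norm is 50.
grads = [[30.0, 40.0], [0.0]]
clipped, norm_before = clip_by_global_norm(grads, max_norm=10.0)
norm_after = math.sqrt(sum(g * g for grad in clipped for g in grad))
print(f"norm before: {norm_before:.3f}, norm after: {norm_after:.3f}")

# Gradients already below max_norm pass through untouched.
small, small_norm = clip_by_global_norm([[3.0, 4.0]], max_norm=10.0)
print(small, small_norm)
```

Logging the norm before clipping at every step would show whether the gradients really are exploding, or whether the jumps in accuracy come from somewhere else.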