Exploding gradients in DenseNet

I've recently been training a DenseNet model for emotion detection (image classification) using the Adam optimizer. The problem is that even with the learning rate set as low as 1e-16 and `max_norm=10` in `nn.utils.clip_grad_norm_`, training still shows abrupt jumps in training accuracy.
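For reference, this is roughly how I'm applying the clipping in the training loop (a minimal sketch with a stand-in linear model and made-up shapes, not my actual DenseNet; the clip happens after `backward()` and before `step()`):

```python
import torch
import torch.nn as nn

# Stand-in model and data: 10 input features, 7 emotion classes (hypothetical)
model = nn.Linear(10, 7)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

x = torch.randn(32, 10)
y = torch.randint(0, 7, (32,))

optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()

# Clip AFTER backward() and BEFORE step().
# Returns the total gradient norm measured BEFORE clipping.
total_norm = nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)

optimizer.step()
```

Note that `clip_grad_norm_` rescales the gradients in place so their combined norm is at most `max_norm`; its return value is the pre-clip norm, which is useful to log if you suspect gradient explosions.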

Has anyone faced the same issue? Please help me.