Hi all. I'm running into a similar problem. My bug is probably that I'm using the wrong combination of Softmax and loss function, so the gradient values are extremely small.
For me:
My model doesn't seem to be training.
Checking a = list(model.parameters())[0].clone() before and b = list(model.parameters())[0].clone() after the call to loss.backward() and optimizer.step(): a == b returns False, so the weights do change.
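A minimal sketch of the check I'm doing (the model, optimizer, and data here are toy placeholders, not my actual code; I compare with torch.equal because a == b on tensors gives an elementwise result):

```python
import torch
import torch.nn as nn

# toy stand-ins for my real model and data
model = nn.Linear(10, 3)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()
inputs, targets = torch.randn(8, 10), torch.randint(0, 3, (8,))

a = list(model.parameters())[0].clone()   # weights before the update

optimizer.zero_grad()
loss = criterion(model(inputs), targets)
loss.backward()
optimizer.step()

b = list(model.parameters())[0].clone()   # weights after the update
print(torch.equal(a, b))                  # False -> the weights did change
print((a - b).abs().max())                # ...but possibly only by a tiny amount
```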
Printing list(model.parameters())[0].grad returns a matrix where every value is very small, on the order of 10^-8.
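In case it helps anyone, here is a toy sketch of the kind of mismatch I mean (assuming a classification head): nn.CrossEntropyLoss already applies log-softmax internally, so putting an explicit Softmax on the logits first squashes the values and shrinks the gradients. Feeding raw logits to CrossEntropyLoss (or using LogSoftmax + NLLLoss) avoids this.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(8, 3, requires_grad=True)
targets = torch.randint(0, 3, (8,))

# buggy: Softmax before CrossEntropyLoss (which applies log-softmax itself)
loss_bad = nn.CrossEntropyLoss()(torch.softmax(logits, dim=1), targets)
loss_bad.backward()
print(logits.grad.abs().max())   # noticeably smaller gradients

logits.grad = None               # reset before the second backward pass

# correct: raw logits straight into CrossEntropyLoss
loss_ok = nn.CrossEntropyLoss()(logits, targets)
loss_ok.backward()
print(logits.grad.abs().max())   # healthier gradients
```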