You’re not using an optimizer with momentum, by chance?
For those, the momentum will cause updates even when the gradients are zero.
(It’s probably also doing funny things to the optimizer’s statistics, but with dropout we rarely think about that too much.)
As a trick, you can (at least you could the last time I checked) set the gradients to None instead of just zeroing them; the optimizer will then skip those parameters rather than update them.
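
To make that concrete, here is a minimal sketch of the trick (the model and the choice of which parameter to freeze are made up for illustration): after `backward()`, set the gradient of the parameter you want to keep fixed to `None` before calling `step()`, and SGD with momentum will skip it instead of applying its momentum buffer.

```python
import torch

# Hypothetical example: a tiny model with SGD + momentum.
model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

x = torch.randn(8, 4)
loss = model(x).sum()

optimizer.zero_grad()
loss.backward()

# Suppose we want to keep the bias fixed: with momentum, zeroing its gradient
# is not enough (the momentum buffer would still move it), but setting the
# gradient to None makes the optimizer skip the parameter entirely.
model.bias.grad = None

optimizer.step()
```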
Some people like TensorboardX.
Best regards
Thomas