Hi, I have a problem that has been bothering me for a week; any suggestion would be appreciated.
The problem is that when I try to zero out part of the gradients, the corresponding weights are not frozen as expected: they keep drifting by a very small amount, on the order of a truncation error.
In my application, I need to zero out part of the gradients under specific conditions. Here is example code to reproduce the error.
The code is modified from the following PyTorch example.
I tried the following 3 ways to zero out the gradients, and the error is there in all of them:

```python
model.conv1.weight.grad *= 0
model.conv1.weight.grad.fill_(0)
model.conv1.weight.grad.data.zero_()
```
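Since I can't attach my full script, here is a minimal self-contained sketch of the kind of reproduction I mean. The toy model and the optimizer hyperparameters are hypothetical stand-ins (I assume SGD with momentum and weight decay for illustration; they are not my actual settings), and it uses the first zeroing method above:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy model standing in for the conv net in my script.
model = nn.Sequential(nn.Conv2d(1, 2, 3), nn.Flatten(), nn.Linear(72, 4))
conv = model[0]

# Assumed optimizer settings for illustration only; the exact
# hyperparameters are not the point.
opt = torch.optim.SGD(model.parameters(), lr=0.1,
                      momentum=0.9, weight_decay=1e-4)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(4, 1, 8, 8)
target = torch.randint(0, 4, (4,))

before = conv.weight.detach().clone()
for _ in range(3):
    opt.zero_grad()
    loss_fn(model(x), target).backward()
    conv.weight.grad *= 0          # zero out the conv gradients
    opt.step()

# The conv weights still move a little even though their grads are zero.
drift = (conv.weight.detach() - before).abs().max().item()
print(f"max weight drift with zeroed grads: {drift:.2e}")
```

With these settings the printed drift is small but nonzero, which is exactly the behavior I'm seeing.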
The error is small, so it does not affect the performance much, but I'd like to understand how this happens. Thanks a lot.