In the code below, why does the gradient hook print more times as the epoch number increases?
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # assumed parameter; the original post doesn't show how self.w is created
        self.w = nn.Parameter(torch.tensor(1.0))

    def forward(self, x):
        out = x * self.w
        # a new hook is registered on self.w every time forward runs
        self.w.register_hook(lambda grad: print("GRADIENT OF SELF.W IS", grad))
        return out

net = Net()
learning_rate = 0.0005
optimizer = torch.optim.SGD(net.parameters(), lr=learning_rate, momentum=0)
for i in range(0, 5):
    print("*** Epoch number is ", i)
    optimizer.zero_grad()                      # assumed training step,
    net(torch.tensor(2.0)).backward()          # not shown in the original post
    optimizer.step()
It’s because self.w is always the same Tensor, and every time you run a forward pass, you add another hook onto it.
These hooks don’t go away by themselves, so you keep accumulating more and more of them.
Does this consume memory or have some other adverse effect if we don’t remove them?
But if you add a new one at each iteration, it is expected that at iteration n, you will have n hooks being called.
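A minimal sketch of that accumulation, using a bare tensor instead of a module (the names here are just for illustration):

import torch

w = torch.tensor(1.0, requires_grad=True)
for i in range(3):
    # mimic the original forward(): one more hook is attached each iteration
    w.register_hook(lambda grad: print("GRADIENT OF W IS", grad))
    (w * 2).backward()
    # at iteration i there are i + 1 hooks on w, so the print fires i + 1 times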
I think what you want here is to move the hook registration to the __init__ of your nn.Module so that it is done only once; then, at every iteration, the hook will be called exactly once.
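For example, a minimal sketch of that suggestion (assuming self.w is an nn.Parameter, which the original post doesn’t show):

import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.w = nn.Parameter(torch.tensor(1.0))
        # registered once, on the one Tensor that every forward re-uses,
        # so it fires exactly once per backward
        self.w.register_hook(lambda grad: print("GRADIENT OF SELF.W IS", grad))

    def forward(self, x):
        return x * self.w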
You just pointed out the right solution. But I don’t understand how something defined in __init__ gets called during backward in every epoch. Until now I assumed __init__ plays no role in forward or backward; it is just run once when the model is created.
self.w is defined once during the __init__ and is then re-used by every forward. So if you add a hook to that Tensor inside forward, another hook gets attached every time forward runs, and all of them fire at the next backward. In your code above, if you register the hook on out instead, you won’t see this behavior anymore, because out is a new Tensor at every forward. The behavior you observe happens only because self.w is the same Tensor that is re-used at every forward.
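A minimal sketch of that variant (note the hook now reports the gradient of out, not of self.w; the print message is just illustrative):

    def forward(self, x):
        out = x * self.w
        # out is a brand-new Tensor on every call, so the hook is registered
        # on a fresh Tensor each time and fires only once per backward
        out.register_hook(lambda grad: print("GRADIENT OF OUT IS", grad))
        return out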