Hi everyone!
I’m trying to understand how backward hooks work, so I’m playing around with them to observe and learn.
I applied a backward hook to a simple convnet. The hook looks something like this:
def hook_function(m, i, o):
    estimated_gradient = torch.ones_like(i[0])
    estimated_gradient = torch.unsqueeze(estimated_gradient, dim=0)
    return estimated_gradient
Here I’m returning a tensor of 1s as the input gradient for the next layer to use. I print these values after the forward pass (but before calling loss.backward() and optimizer.step()).
The code looks something like this -
#print(list(model.parameters())[6].shape) # I used this to identify the particular layer I want to print
print(list(model.parameters())[6].grad[0])
This prints the grads of the last layer in my convnet. The code where I apply these hooks is as follows (I apply them to all the linear layers of my convnet):
def register_backward_hook_for_(Model):
    target_modules = []
    print(list(Model.modules()))
    for m in Model.modules():
        if isinstance(m, nn.Linear):
            target_modules.append(m)
    # print(target_modules)
    for modules in target_modules:
        modules.register_full_backward_hook(hook_function)
Expected output: I expected these values to be 1, since I haven’t performed any backward computation yet (I print before my loss.backward() call).
Actual output: real-valued gradients.
What is going wrong? Or am I missing something about how these hooks work?
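For completeness, here is a minimal end-to-end version of what I’m doing. The model, layer sizes, loss, and optimizer here are stand-ins for my actual convnet, and the loop ordering is my best reconstruction of my training loop; I also simplified the hook to return a tuple matching grad_input, without the unsqueeze.

```python
import torch
import torch.nn as nn

def hook_function(m, i, o):
    # simplified version of my hook: replace the gradient flowing into
    # this layer's input with ones (full backward hooks expect a tuple
    # with one entry per input)
    return (torch.ones_like(i[0]),)

# stand-in model: the layer sizes are made up for this example
model = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2))
for m in model.modules():
    if isinstance(m, nn.Linear):
        m.register_full_backward_hook(hook_function)

# requires_grad so the hook on the first Linear gets a real grad_input
x = torch.randn(3, 8, requires_grad=True)
target = torch.randn(3, 2)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for step in range(2):
    out = model(x)
    loss = criterion(out, target)
    # this is where I print: after the forward pass, before backward
    print("step", step, "grad before backward:", model[2].weight.grad)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

On the very first iteration the print shows None (no backward has run yet), but on later iterations I see real values at the same print statement.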