I’m trying to understand how backward hooks work, so I’m playing around with them to observe and learn.
I applied a backward hook to a simple convnet. The hook looks something like this:
def hook_function(m, i, o):
    # i (grad_input) is a tuple of tensors, so the hook must return a matching tuple
    estimated_gradient = tuple(torch.ones_like(g) if g is not None else None for g in i)
    return estimated_gradient
Here I’m returning tensors of 1s as the input gradient for the next layer (in the backward pass) to use. I print these values after the forward pass (but before calling loss.backward() and optimizer.step()).
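As a sanity check on the hook signature, here is a tiny standalone example (the layer sizes are made up) showing that grad_input and grad_output arrive as tuples:

import torch
import torch.nn as nn

def show_shapes(m, i, o):
    # both i (grad_input) and o (grad_output) are tuples of tensors
    print([None if g is None else tuple(g.shape) for g in i],
          [tuple(g.shape) for g in o])

layer = nn.Linear(3, 2)
layer.register_full_backward_hook(show_shapes)
layer(torch.randn(4, 3, requires_grad=True)).sum().backward()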
The code where I print the grads looks something like this -

# print([p.shape for p in model.parameters()])  # I used this to identify the particular layer I want to print
print(list(model.parameters())[-1].grad)
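For context, that print sits in my training loop roughly like this (model, criterion, optimizer, and the batch are stand-ins for my actual setup):

output = model(images)                      # forward pass
loss = criterion(output, labels)
print(list(model.parameters())[-1].grad)    # inspected here, before backward
loss.backward()
optimizer.step()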
That print statement shows the grads of the last layer in my convnet. The code where I register the hooks is as follows (I apply them to all the linear layers of my convnet) -
def register_backward_hook_for_(Model):
    target_modules = []
    # print(list(Model.modules()))
    for m in Model.modules():
        if isinstance(m, nn.Linear):
            target_modules.append(m)
    # print(target_modules)
    for module in target_modules:
        module.register_full_backward_hook(hook_function)
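To make this reproducible, here is a minimal sketch of my whole setup (the architecture, data, and optimizer are made-up stand-ins for my actual convnet):

import torch
import torch.nn as nn

def hook_function(m, i, o):
    # replace grad_input with a matching tuple of ones
    return tuple(torch.ones_like(g) if g is not None else None for g in i)

model = nn.Sequential(
    nn.Conv2d(1, 4, 3), nn.ReLU(), nn.Flatten(),
    nn.Linear(4 * 26 * 26, 10),
)
for m in model.modules():
    if isinstance(m, nn.Linear):
        m.register_full_backward_hook(hook_function)

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
for step in range(2):
    out = model(torch.randn(2, 1, 28, 28))
    loss = out.sum()
    print(list(model.parameters())[-1].grad)  # printed before backward
    loss.backward()
    optimizer.step()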
Expected output - I expected these values to be 1, since I had not yet performed any backward computation (I print them before my loss.backward() call).
Actual output - real-valued gradients.
What is going wrong? Or am I missing something about how these hooks work?