# Gradient computation when using forward hooks

Suppose I have a custom `nn.Module`:

```python
import torch
from torch import nn

class Identity(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x):
        return x

hooked_layer = Identity()

# hookfn can in principle be a complex function,
# even a non-differentiable one such as quantization
hookfn = lambda module, input, output: output * 2
hooked_layer.register_forward_hook(hookfn)
```

Now consider an activation `A`, which passes through `hooked_layer` to become `A_hooked`. I understand that `A_hooked` is used in the computation of the loss.
Now I want to understand how back-propagation will happen.
1.) How will the gradient of `A` be computed? Will the gradient of `A_hooked` be copied to `A`, or will the gradient of `A` equal half of the gradient of `A_hooked`?
2.) In the above case, if the gradient of `A_hooked` is not copied to the gradient of `A`, then what will happen if I use a non-differentiable hook, such as quantization?
3.) Lastly, in the layers following our original `hooked_layer`, will `A` or `A_hooked` be used in back-propagation?

Hi,

I think the simplest way to understand what happens here is to know that autograd lives below `torch.nn` and is completely unaware of what `torch.nn` does.
So in this case, whatever Tensor you give to the rest of the net is the one that will get gradients (it does not matter whether it comes from a hook or not).
And here, since `A_hooked` depends on `A`, the gradients will flow back from `A_hooked` to `A`.
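A quick sketch of both cases (the layer and variable names are my own, and I use `nn.Identity` instead of a custom module): a differentiable `*2` hook scales the gradient by 2, while a quantizing hook such as `torch.round` has zero gradient almost everywhere, so the gradient that reaches the input is zero rather than an error:

```python
import torch
from torch import nn

# Differentiable hook: autograd records the *2, so the gradient is scaled by 2.
double_layer = nn.Identity()
double_layer.register_forward_hook(lambda m, inp, out: out * 2)

A = torch.ones(3, requires_grad=True)
double_layer(A).sum().backward()
print(A.grad)  # tensor([2., 2., 2.])

# Non-differentiable hook: torch.round has zero derivative almost everywhere,
# so backward runs fine but the gradient reaching B is all zeros.
quant_layer = nn.Identity()
quant_layer.register_forward_hook(lambda m, inp, out: torch.round(out))

B = torch.full((3,), 0.3, requires_grad=True)
quant_layer(B).sum().backward()
print(B.grad)  # tensor([0., 0., 0.])
```

So the gradient is not simply copied: the hook's operation is part of the graph, and its own derivative determines what flows back.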


Can somebody confirm that forward hooks support back-propagation? For example, I obtain activations (or weights) from particular layers using forward hooks. Then, I compute the L1 norm of these activations (or weights) and add it to my main loss term, i.e. `loss = loss + L1_loss`. Now, when I call `loss.backward()`, the gradients will flow through the hooks and the activations (or weights) will be penalized accordingly, right?
Another option I came across was iterating over model parameters (maybe with some if-else statements), which should work just fine, but it doesn't provide the flexibility I am looking for.
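A minimal sketch of that approach, assuming a toy model (the model, hook, and weight `0.01` below are my own choices): the hook stores the graph-connected activation tensor, so an L1 penalty built from it back-propagates into the parameters that produced it:

```python
import torch
from torch import nn

model = nn.Sequential(nn.Linear(4, 4), nn.ReLU(), nn.Linear(4, 1))

activations = []  # hypothetical storage for hooked activations
def save_activation(module, inp, out):
    # Keep the tensor as-is (no .detach()), so it stays in the autograd graph.
    activations.append(out)

model[0].register_forward_hook(save_activation)

x = torch.randn(8, 4)
loss = model(x).pow(2).mean()
l1_loss = sum(a.abs().sum() for a in activations)
total = loss + 0.01 * l1_loss
total.backward()  # gradients flow through the stored activations too

print(model[0].weight.grad is not None)  # True
```

The key detail is not detaching the stored tensor; `.detach()` (or storing `out.data`) would cut the penalty out of the graph.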