Query regarding register_backward_hook

I don’t think the hook will work as expected (see this recent thread, Exact meaning of grad_input and grad_output, for a discussion of the values passed to the hook).
If you write the model yourself, you could just use hidden.register_hook(lambda grad: grad.clamp(min=0)) on the activations between the layers (similar to the gradient clipping discussed here).
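For illustration, here is a minimal sketch of what that could look like; the hand-written two-layer model, the layer sizes, and the variable names (fc1, fc2, hidden) are made up for the example and not from your code:

```python
import torch
import torch.nn as nn

# Hand-written two-layer model so we can grab the intermediate
# activation and attach a hook directly to its gradient.
fc1 = nn.Linear(10, 20)
fc2 = nn.Linear(20, 1)

x = torch.randn(4, 10)

hidden = torch.relu(fc1(x))
# Clamp the gradient flowing back through the hidden activation;
# the value returned by the hook replaces the original gradient.
hidden.register_hook(lambda grad: grad.clamp(min=0))

out = fc2(hidden)
out.sum().backward()
```

The hook fires when the backward pass reaches hidden, so the clamped gradient is what gets propagated further back into fc1.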

Best regards

Thomas
