How to remove reduction in a register_hook

Hi!

Thanks for the quick response! The reason I’m asking about this is that I want access to the gradient of the loss w.r.t. W for every input in a batch. In effect, I want d(Loss)/d(W) for each input in the batch separately, which would have a dimensionality along the lines of [hid_features, in_features, batch].
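To make concrete what I mean by “for all inputs in a batch”, here is a rough brute-force sketch of the quantity I’m after, doing one backward pass per sample (the sizes and the squared-error loss are just placeholders, not my actual model):

```python
import torch
import torch.nn as nn

# Placeholder sizes, just for illustration.
batch, in_features, hid_features = 8, 16, 32

linear = nn.Linear(in_features, hid_features)
x = torch.randn(batch, in_features)
target = torch.randn(batch, hid_features)

# Brute force: one backward pass per sample in the batch.
grads = []
for i in range(batch):
    loss_i = ((linear(x[i]) - target[i]) ** 2).sum()      # placeholder loss
    (g,) = torch.autograd.grad(loss_i, linear.weight)     # d(Loss_i)/d(W)
    grads.append(g)

# Stack along a trailing batch dimension: [hid_features, in_features, batch]
per_sample_dW = torch.stack(grads, dim=-1)
print(per_sample_dW.shape)  # torch.Size([32, 16, 8])
```

This gives the right thing, but it needs a separate backward pass per sample, which is exactly what I’d like to avoid.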

From my understanding, register_hook (registered on the weight W of a module, in my case nn.Linear) gives d(Loss)/d(W), which has the dimensionality [hid_features, in_features], i.e. already reduced over the batch. On the other hand, register_backward_hook passes grad_output, i.e. d(Loss)/d(Z) where Z is the output of the module, which has the dimensionality [batch, hid_features] and therefore keeps a separate gradient for each input. My understanding mainly comes from this post here: Exact meaning of grad_input and grad_output (I assume the explanation within this post is correct?). Since the register_backward_hook quantity keeps the batch dimension, I was wondering whether it is possible to get the gradient of the loss with respect to W for all values in the batch in a similar manner, i.e. d(Loss)/d(W) with the batch dimension kept rather than reduced away.
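To show roughly what I have in mind (I’m not sure whether this is a supported or sensible way to do it), here is a minimal sketch for a single nn.Linear that saves the layer input in a forward hook and combines it with grad_output in a backward hook to rebuild a per-sample d(Loss)/d(W). The register_full_backward_hook call, the einsum, and the stash dict are just my guesses at how it might be stitched together, not something taken from the docs:

```python
import torch
import torch.nn as nn

batch, in_features, hid_features = 8, 16, 32
linear = nn.Linear(in_features, hid_features)
stash = {}

def save_input(module, inputs, output):
    # Forward hook: stash the batch of inputs the layer saw.
    stash["x"] = inputs[0].detach()

def per_sample_weight_grad(module, grad_input, grad_output):
    # Backward hook: grad_output[0] is d(Loss)/d(Z), shape [batch, hid_features].
    # Outer product with the saved input gives d(Loss)/d(W) per sample,
    # shape [batch, hid_features, in_features].
    stash["dW"] = torch.einsum("bo,bi->boi", grad_output[0], stash["x"])

linear.register_forward_hook(save_input)
linear.register_full_backward_hook(per_sample_weight_grad)

x = torch.randn(batch, in_features)
loss = linear(x).sum()
loss.backward()

print(stash["dW"].shape)  # torch.Size([8, 32, 16])
# Summing over the batch should reproduce the accumulated gradient.
print(torch.allclose(stash["dW"].sum(dim=0), linear.weight.grad))  # True
```

The idea being that, for nn.Linear, d(Loss)/d(W) for sample b is the outer product of grad_output[b] and the input x[b], so summing the per-sample tensor over the batch should reproduce linear.weight.grad.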

Is this something that is possible with PyTorch?

Thank you for the help and clarification!