Is there a way to get the gradients of all the “parent” nodes of a node during the execution of backprop (viewing the graph going backward, i.e. the output node is the root and the input nodes are the leaves)?
For example, in the graph built by these expressions,
c = a * b
out = a + c
there are two paths from out to the node a: one direct, and one through c (shown below).
a ------- out
 \       /
  \     /
   \   /
    \ /
b----c
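For concreteness, here is a tiny runnable version of that graph (the values are arbitrary; I'm just looking at the grad_fn chain):

import torch

a = torch.tensor(2.0, requires_grad=True)
b = torch.tensor(3.0, requires_grad=True)
c = a * b
out = a + c

print(out.grad_fn)                 # <AddBackward0 ...>
print(c.grad_fn)                   # <MulBackward0 ...>
print(out.grad_fn.next_functions)  # ((<AccumulateGrad for a>, 0), (<MulBackward0 for c>, 0))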
I’d like to manipulate the gradient that gets computed for a by running an arbitrary operation on the gradients accumulated from the parents of a (in this case c and out), probably by using a prehook on the nodes in the computation graph.
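To show the kind of thing I mean, here is a rough sketch using post-hooks on the parent grad_fn nodes (so far I've only found register_hook, not a prehook, that exposes this; the capture helper, the names, and the combining op are just illustrative, and nothing here actually changes what gets written to a.grad):

import torch

a = torch.tensor(2.0, requires_grad=True)
b = torch.tensor(3.0, requires_grad=True)
c = a * b
out = a + c

# Capture the per-path gradient contribution headed toward `a` from each
# of its two parents in the backward graph, then combine them with an
# arbitrary op instead of autograd's usual sum.
contribs = {}

def capture(name):
    # Hooks registered on a grad_fn node receive (grad_inputs, grad_outputs)
    # after the node has computed its input gradients; index 0 is the piece
    # headed toward `a` for both AddBackward0 (inputs a, c) and MulBackward0
    # (inputs a, b).
    def hook(grad_inputs, grad_outputs):
        contribs[name] = grad_inputs[0]
    return hook

out.grad_fn.register_hook(capture("via_out"))
c.grad_fn.register_hook(capture("via_c"))

out.backward()
print(contribs)   # {'via_out': tensor(1.), 'via_c': tensor(3.)}
print(a.grad)     # tensor(4.) -- autograd's default sum of the two contributions
combined = torch.maximum(contribs["via_out"], contribs["via_c"])  # what I'd like instead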
While I can get the parent grad_fns of a node’s grad_fn, I’m having trouble figuring out how to get the gradient associated with each of those nodes. I was originally hoping I could do parent_grad_fn.variable.grad, but it looks like these parent grad functions don’t have a .variable attribute.
I’ve also attached a minimal example in Colab here showing how I’ve been able to access the parent grad functions but not the actual grad values associated with them.
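In short, the pattern I’ve been trying looks roughly like this (a simplified sketch, not the exact notebook code):

import torch

a = torch.tensor(2.0, requires_grad=True)
b = torch.tensor(3.0, requires_grad=True)
c = a * b
out = a + c
out.backward()

# Walking the backward graph via next_functions works fine...
for node, _ in out.grad_fn.next_functions:
    print(node)                        # <AccumulateGrad ...>, then <MulBackward0 ...>
    # ...but only the leaf AccumulateGrad node carries a .variable I can
    # follow back to a tensor; intermediate nodes like MulBackward0 do not,
    # so node.variable.grad fails there.
    if hasattr(node, "variable"):
        print(node.variable.grad)      # tensor(4.) for a's AccumulateGrad
    else:
        print("no .variable on", node)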
Thanks for the help!