In grad_fn, I found a next_functions attribute, but I don't understand its meaning.


(wuyongyu) #1
import torch
from torch.autograd import Variable

a = Variable(torch.randn(1), requires_grad=True)
p = a * a
p_tmp = p.expand_as(p)
grad_acc = p_tmp.grad_fn.next_functions[0][0]

I don't know the meaning of 'next_functions', and I couldn't find where it is defined.


(Thomas V) #2

First, don’t use Variable anymore! Be modern!

Regarding your question: next_functions lets you traverse the recorded calculation graph (the "backward graph").
The backward graph ends in AccumulateGrad nodes for the leaves (they have a .variable attribute pointing to the leaf tensor) - and yours does so rather quickly, as you only have one operation. Let's take a slightly more elaborate example:

import torch

a = torch.randn(1, requires_grad=True)
b = a * (a + 2)
print(b.grad_fn.next_functions)                       # branches of the multiplication
print(b.grad_fn.next_functions[1][0].next_functions)  # branches of the addition
print(b.grad_fn.next_functions[0][0].variable is a)   # AccumulateGrad points at the leaf

gives

((<AccumulateGrad object at 0x7fbe7aa96780>, 0), (<AddBackward0 object at 0x7fbe7aa96748>, 0))
((<AccumulateGrad object at 0x7fbe7aa96780>, 0), (None, 0))
True

So in a*(a+2) you have one branch for a and one for a+2, and the latter has an a branch and an uninteresting 2 branch (the None entry in the output above).
Except at the leaves, you cannot, in general, access the variables of the calculation from the graph.
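If you want to do that traversal programmatically, here is a minimal sketch (collect_leaves is a made-up helper, not part of the autograd API):

import torch

def collect_leaves(grad_fn, leaves=None):
    # Walk the backward graph depth-first; only the AccumulateGrad
    # nodes carry a .variable attribute pointing at a leaf tensor.
    if leaves is None:
        leaves = []
    if grad_fn is None:  # constants like the 2 show up as None branches
        return leaves
    if hasattr(grad_fn, 'variable'):
        leaves.append(grad_fn.variable)
    for next_fn, _ in grad_fn.next_functions:
        collect_leaves(next_fn, leaves)
    return leaves

a = torch.randn(1, requires_grad=True)
b = a * (a + 2)
print(collect_leaves(b.grad_fn))  # the leaf a appears twice, once per branch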

Best regards

Thomas


(wuyongyu) #3

@tom Thanks very much! I'll remember that and be modern. :grin::grin:


(saluto) #4

Thanks for your explanation. Can you explain what the second element in each of these inner tuples stands for? They are all 0, and I cannot create an example that produces a different value. Can they be, e.g., 1?


(Thomas V) #5

The number is the index of the input to the next backward function, so it can only be non-zero when a function has multiple differentiable outputs (there aren't that many, but e.g. the RNN functions typically do).
A minimal example that doesn't serve much purpose except showing you a 1:

import torch

# unbind splits the length-2 tensor into two scalars, so a and b are
# the first and second outputs of the same Unbind node
a, b = torch.randn(2, requires_grad=True).unbind()
c = a + b
print(c.grad_fn.next_functions)
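
This should print something like ((<UnbindBackward object at 0x...>, 0), (<UnbindBackward object at 0x...>, 1)) - the 1 marks that b was the second output of the unbind.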

Unbind is not that well-known: it is the "opposite" of stack, splitting a tensor along one dimension (the first by default) into a tuple of tensors.
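
As a quick illustration of that relationship, a minimal sketch:

import torch

x = torch.stack([torch.zeros(3), torch.ones(3)])  # stack: two (3,) tensors -> one (2, 3) tensor
u0, u1 = x.unbind()                               # unbind: back to two (3,) tensors
print(torch.equal(torch.stack([u0, u1]), x))      # True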

Best regards

Thomas


(saluto) #6

Thanks again. Now I understand.


(saluto) #7

Sorry for bothering you again. This is off topic, but probably not worth a new thread: do you have an explanation for the following behaviour of unbind()?

import torch

a = torch.randn(2, requires_grad=True)
a0, a1 = a.unbind()
# a0, a1 = a  # but this works
a0.backward()  # backward through only one of the unbind outputs

Causes this error:

RuntimeError: Expected a Tensor of type Variable but found an undefined Tensor at position #1 for iterable argument #0 'tensors'

Best regards,
Saluto


(Thomas V) #8

This has been fixed in master.

Best regards

Thomas


(saluto) #9

Really great, thanks!