What do "leaf variable" and "accumulated gradient" mean?

I am a beginner in PyTorch. I don't fully understand leaf nodes in the computation graph. For example:

import torch.nn as nn
net = nn.Linear(3, 3)

This example has two leaf nodes, so my rough guess is that the input and the output of the net are the leaf nodes.
Also, what does "accumulated gradient" mean for leaf nodes?

Thanks in advance!

The leaf nodes are the weight and bias parameters of the Linear module.
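You can check this directly with the Tensor.is_leaf attribute (a quick sketch, rebuilding the net from the question):

import torch
import torch.nn as nn

net = nn.Linear(3, 3)
print(net.weight.is_leaf)   # True: created directly, not by an autograd operation
print(net.bias.is_leaf)     # True

out = net(torch.randn(2, 3))
print(out.is_leaf)          # False: produced by the forward computation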
Whenever you run

out = net(input)
loss = loss_fn(out, target)
loss.backward()

the gradients of the loss with respect to the weight and bias parameters of the Linear module are added to net.weight.grad and net.bias.grad.

Note that running loss.backward() does not replace the gradients stored in net.weight.grad and net.bias.grad; it adds the new gradients to those already there. Hence the term "accumulated".
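Here is a minimal runnable sketch of that accumulation (the tensor shapes and the MSE loss are placeholder choices for illustration):

import torch
import torch.nn as nn

net = nn.Linear(3, 3)
loss_fn = nn.MSELoss()          # placeholder loss for illustration
x = torch.randn(2, 3)
y = torch.randn(2, 3)

loss_fn(net(x), y).backward()
first = net.weight.grad.clone()

loss_fn(net(x), y).backward()   # same data, second backward pass
print(torch.allclose(net.weight.grad, 2 * first))  # True: new gradients were added, not written over

net.zero_grad()                 # clear the accumulated gradients before the next step

This accumulation is also why training loops typically call optimizer.zero_grad() (or net.zero_grad()) before each backward pass.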
