The tutorial (http://pytorch.org/tutorials/beginner/former_torchies/autograd_tutorial.html#gradients) says: "so if you even want to do the backward on some part of the graph twice, you need to pass in retain_variables = True during the first pass."
The example it gives is:
import torch
from torch.autograd import Variable
x = Variable(torch.ones(2, 2), requires_grad=True)
y = x + 2
y.backward(torch.ones(2, 2), retain_graph=True)
z = y * y
gradient = torch.randn(2, 2)
y.backward(gradient)  # second backward pass through the same graph
But when I run this code with retain_graph=True and with retain_graph=False, both versions work without error, and the gradients are correct.
Is there anything wrong with the example?
In this specific case you do not need retain_graph=True, but in general you may need it. As you compute the forward pass, PyTorch saves the variables that will be needed to compute the gradients in the backward pass. For example, z = y * y needs to save the value of y, because dz/dy = 2*y (or y + y). However, y = x + 2 doesn’t need to save anything, because dy/dx = 1, which doesn’t depend on x.
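To make the two derivatives concrete, here is a minimal sketch that asks autograd for each one directly. It assumes PyTorch 0.4 or later, where a plain tensor with requires_grad=True replaces the Variable wrapper; the names dz_dy and dy_dx are only for illustration.

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
y = x + 2   # dy/dx = 1, independent of the value of x
z = y * y   # dz/dy = 2 * y, depends on the value of y

# Gradient of z.sum() with respect to y; retain_graph=True keeps the
# graph intact for the next call.
(dz_dy,) = torch.autograd.grad(z.sum(), y, retain_graph=True)

# Gradient of y.sum() with respect to x.
(dy_dx,) = torch.autograd.grad(y.sum(), x)

print(dz_dy)  # same values as 2 * y
print(dy_dx)  # all ones
```

The multiply can only produce its gradient if the value of y is still available, while the add produces ones no matter what x was.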
When you call backward() with retain_graph=False (or without specifying it), the automatic differentiation engine frees the saved variables as it computes the gradients. If you call backward() again, it will fail with an exception if it needs any of the freed saved variables. If it doesn’t need any saved variables, as in your example, it will succeed, but you shouldn’t rely on this behavior.
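As a rough illustration of the safe pattern, the sketch below runs backward twice over a graph that does need a saved value, passing retain_graph=True on the first call. It assumes PyTorch 0.4 or later and uses a multiply so that something is actually saved; note that the gradients accumulate in x.grad across calls.

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
y = x * x  # the multiply saves x for the backward pass

# First pass: keep the saved values so the graph can be reused.
y.backward(torch.ones(2, 2), retain_graph=True)
first = x.grad.clone()   # 2 * x

# Second pass: works because the first pass retained the graph.
y.backward(torch.ones(2, 2))
second = x.grad.clone()  # accumulated: 2 * x + 2 * x

print(first)
print(second)
```

If the accumulation is not wanted, x.grad can be reset (for example with x.grad.zero_()) between the two calls.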
If you change y = x + 2 to y = x * x, you will see an error:
Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.
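A small sketch that reproduces this failure, assuming PyTorch 0.4 or later. The exception is a RuntimeError; the exact wording of the message differs between PyTorch versions, so the code only checks that the second call raises.

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
y = x * x  # the multiply saves x for the backward pass

# First pass: retain_graph defaults to False, so saved values are freed.
y.backward(torch.ones(2, 2))

try:
    # Second pass needs the freed values and raises.
    y.backward(torch.ones(2, 2))
    failed = False
except RuntimeError as err:
    failed = True
    print(err)
```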
Thanks for the explanation!