Here is a minimal nontrivial example that confuses me:
Example 1: passes.

```python
import torch

x = torch.nn.Linear(3, 3)
a = torch.autograd.Variable(torch.randn(2, 3))
for i in range(100):
    y = torch.sum(x(a))
    y.backward()
```
Example 2: fails with `RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.`

```python
import torch

x = torch.nn.Linear(3, 3)
a = torch.autograd.Variable(torch.randn(2, 3))
y = torch.sum(x(a))
for i in range(100):
    y.backward()
```
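Incidentally, doing what the error message suggests does make Example 2 run; a minimal sketch (note that gradients then accumulate in x's parameters across iterations):

```python
import torch

x = torch.nn.Linear(3, 3)
a = torch.autograd.Variable(torch.randn(2, 3))
y = torch.sum(x(a))
for i in range(100):
    # retain_graph=True keeps the graph's intermediate buffers alive,
    # so the same graph can be traversed again on the next iteration
    y.backward(retain_graph=True)
```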
What confuses me is that, in principle, the graph should be freed after calling y.backward() in the first example, and x and a are obviously contained in the graph, so they should be freed too. I therefore did not expect the first example to pass, since during the second y.backward() call we should no longer be able to locate x and a. The second example is even more confusing to me: merely moving y = torch.sum(x(a)) outside the loop produces an error.
So my questions are: when is the graph created in each case? Which autograd.Variables / nn.Modules are contained in the graph? And if the graph is freed after y.backward() is called, why does the first example pass while the second one fails?
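To make the question concrete, here is how I understand "the graph" can be inspected (a small sketch; I assume the graph is built during the forward call, and the exact grad_fn node names may differ across versions):

```python
import torch

x = torch.nn.Linear(3, 3)
a = torch.autograd.Variable(torch.randn(2, 3))

y = torch.sum(x(a))
# Each forward call records the producing operation on its output,
# so the graph appears to exist as soon as y is computed:
print(y.grad_fn)                 # e.g. <SumBackward0 object at ...>
print(y.grad_fn.next_functions)  # upstream nodes, e.g. the Linear's backward node

y.backward()
# After backward() the graph's intermediate buffers are freed, but
# y.grad_fn itself is still reachable; calling y.backward() a second
# time here raises the RuntimeError quoted above.
```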