Problem with backward pass, buffers have already been freed


I am running into the backpropagation error that states: “Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time”.

However, I don’t see where my code is trying to run through the graph a second time. Let me show you two examples: the first works fine, the second does not, and I do not understand why it fails.

import torch
import torch.optim as optim

x = torch.randn( (5, 4), requires_grad=True )
theta = torch.randn( (4, 3), requires_grad=False )

optimizer = optim.Adam( [x] )

xx = x  # Apparently this is the conflicting line

for k in range(10):
    optimizer.zero_grad()
    m = torch.matmul( xx, theta )
    loss = m.sum()  # toy loss, just to have something to backpropagate
    loss.backward()
    optimizer.step()

This works fine. However, if I change the xx declaration to simply

xx = x/10

then I get the backpropagation error. Also, setting “retain_graph=True” as suggested simply makes “x” not optimise at all; it behaves as if “requires_grad” were set to “False”, which is not the behaviour I need.

I don’t understand what is going on.

Part of your computation graph is created outside the loop (xx = x/10).
The first backward() call frees the intermediate buffers of that graph, so the second iteration has nothing to backpropagate through and x cannot be optimized any further. If you move xx = x/10 into the loop, a fresh graph is built on every iteration and the code should work.
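A minimal sketch of that fix (the loss here is an assumed sum reduction, since the original post does not show the loss computation):

```python
import torch
import torch.optim as optim

x = torch.randn((5, 4), requires_grad=True)
theta = torch.randn((4, 3), requires_grad=False)

optimizer = optim.Adam([x])

for k in range(10):
    optimizer.zero_grad()
    xx = x / 10                   # recreated each iteration: a fresh graph every time
    m = torch.matmul(xx, theta)
    loss = m.sum()                # assumed toy loss, not from the original post
    loss.backward()               # frees this iteration's graph, which is now fine
    optimizer.step()
```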


I see, thanks a lot.

With that division I was trying to reduce the variance of the initial random sampling in x, so the logical place to do it was once, outside the loop. Ideally I would first have declared xx with the reduced variance and then handed it to the optimiser, but done like that PyTorch complains that xx is not a leaf variable; i.e.

x = torch.randn( (5, 4), requires_grad=True ) 
xx = x/10
optimizer = optim.Adam( [xx] )  # ValueError: can't optimize a non-leaf Tensor

Is there any other way to have a tensor in the optimiser that is the result of some previous computations?


Nevermind, found the solution:

x = torch.randn( (5, 4) )  # no requires_grad here, so x/10 carries no graph history
xx = (x/10).requires_grad_(True)  # xx is a leaf tensor, so the optimizer accepts it
optimizer = optim.Adam( [xx] )  # Works as expected!
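For completeness, a quick self-contained check that a tensor built this way is a leaf the optimizer can actually update (the sum-of-squares objective is an assumed toy loss, not from the thread):

```python
import torch
import torch.optim as optim

x = torch.randn((5, 4))                 # no requires_grad: x/10 has no graph history
xx = (x / 10).requires_grad_(True)      # in-place requires_grad_ is legal on a leaf
optimizer = optim.Adam([xx])

for k in range(100):
    optimizer.zero_grad()
    loss = (xx ** 2).sum()              # assumed toy objective
    loss.backward()
    optimizer.step()
```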