# Why is the error "Trying to backward through the graph a second time..." solved by detaching the variable?

I’m using torch version 1.0.0.

In each gradient update step (see the last loop in the code below) I perform one forward step and one loss computation. However, when I enter the loop a second time, I get the following error:

``````
RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.
``````

Playing around, I realized I can fix this error by adding the following line to the loop:

``````
states = states.clone().detach()
``````

Why does that solve the problem? I do not really understand what that line does. An autograd tutorial used a similar line to “detach the Variable from its history”. What does this mean? I thought the graph gets cleared when `loss.backward()` runs (that is why I would get the same error if I called `loss.backward()` again right after the first time), but I don’t see what additional function `.detach()` serves. So, to sum up my question:

Why does the code below give an error, and what does `.clone().detach()` do to fix it?

``````
import torch
import torch.nn as nn

sft = nn.functional.softmax

def forward(StateVec, ConnectMatrix, L):
    # One explicit update step of the state, then the read-out through L
    StateVec = StateVec + (2*(-1/8*StateVec - ConnectMatrix.mm(StateVec)))*0.1
    pos = L.mm(sft(StateVec, dim=0))
    return StateVec, pos

N = 6

"""Toy target"""
target = torch.randn(2,20)

"""Randomly initialise L (which ought to be inferred later)"""
L = torch.randn(2,N)
L.requires_grad = True  # assumption: not shown in the post, implied by "ought to be inferred"

"""Produce Connectivity Matrix rho"""
rho = torch.zeros(N,N)
for i in range(N):
    for j in range(N):
        if i == j:
            rho[i, j] = 0
        elif j == i + 1:
            rho[i, j] = 1.5
        elif j == i - 1:
            rho[i, j] = 0.5
        else:
            rho[i, j] = 1

rho[-1, 0] = 1.5
rho[0, -1] = 0.5

"""Initialise state vector as states = [0.5,0,0,...]"""
states = torch.zeros(N,1)
states[0] = 0.5
states.requires_grad = True  # assumption: not shown in the post, implied by the reply below

lr = 0.1 # Learning Rate
for t in range(0, target.shape[1]):
    states, pos = forward(states,rho,L)
    loss = torch.sum((pos - target[:,t].float().view([2,1]))**2)
    loss.backward()  # raises the RuntimeError on the second iteration
``````

I think you do not need gradients for the `states` variable. Commenting out the line that sets `requires_grad` on it will make the code work.
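A minimal sketch of that suggestion, assuming `L` is the only tensor that needs gradients (the question says it "ought to be inferred later"). The random `rho`, the shorter target, and the `completed_steps` counter are illustrative stand-ins, not the original setup. Since `states` does not require gradients, the state update records no graph, and each `backward()` only covers the current iteration.

```python
import torch

torch.manual_seed(0)
N = 6
rho = torch.rand(N, N)                      # fixed connectivity, no gradient needed
L = torch.randn(2, N, requires_grad=True)   # assumed to be the only learnable tensor
target = torch.randn(2, 5)

states = torch.zeros(N, 1)                  # requires_grad stays False
states[0] = 0.5

completed_steps = 0
for t in range(target.shape[1]):
    # No tensor in this update requires grad, so autograd records nothing for it.
    states = states + 0.1 * 2 * (-states / 8 - rho.mm(states))
    pos = L.mm(torch.softmax(states, dim=0))
    loss = torch.sum((pos - target[:, t].view(2, 1)) ** 2)
    loss.backward()                         # graph only spans L -> pos -> loss
    completed_steps += 1
```

Here `L.grad` accumulates across iterations, so a real training loop would also update `L` and zero its gradient after each step.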

Why does `.detach()` work?

This is my understanding. You are reusing the same variable (`states`), which has `requires_grad = True`, again and again. The first time you use it there is no problem: when you back-propagate, the graph is destroyed on the go and the gradient is accumulated in the `states.grad` buffer.
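The freeing of the graph can be seen in isolation with a small sketch. The tensors `x`, `y`, `x2`, and `y2` are made up for illustration; the point is only that a second `backward()` through the same graph fails unless the first call passed `retain_graph=True`.

```python
import torch

x = torch.ones(3, requires_grad=True)
y = (x * x).sum()        # the multiplication saves x for the backward pass

y.backward()             # first pass: x.grad is filled, saved buffers are freed
first_grad = x.grad.clone()

second_call_failed = False
try:
    y.backward()         # second pass through the same, already freed graph
except RuntimeError:
    second_call_failed = True

# With retain_graph=True the buffers are kept, and gradients accumulate in x2.grad.
x2 = torch.ones(3, requires_grad=True)
y2 = (x2 * x2).sum()
y2.backward(retain_graph=True)
y2.backward()
```

After the two successful calls, `x2.grad` holds the sum of both passes, which is why gradients are normally zeroed between update steps.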

When you use `states` a second time (call the result `states2`), you get a variable derived from the original `states`, so there is still a link from `states2` back to the original `states`. When you back-propagate through `states2`, autograd still has to pass the gradients on to the original `states`, because it has `requires_grad=True`. But the graph between the original `states` and `states2` was already destroyed during the first back-propagation. Hence the `RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.`
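Here is a stripped-down sketch of both situations. The `step` and `run` helpers and the parameter `W` are hypothetical stand-ins for the `forward` function and `L` in the question. Without detaching, the second iteration's graph runs back into the first iteration's freed graph; with `.clone().detach()`, each iteration starts from a tensor that carries the same values but no history.

```python
import torch

def step(state, W):
    # mm saves its inputs for the backward pass; backward() frees those buffers
    return state + 0.1 * W.mm(state)

def run(detach_each_step):
    W = torch.eye(3, requires_grad=True)            # hypothetical learnable parameter
    state = torch.ones(3, 1, requires_grad=True)    # plays the role of `states`
    for t in range(3):
        state = step(state, W)                      # new state derived from the previous one
        loss = state.sum()
        try:
            loss.backward()
        except RuntimeError:
            return t                                # iteration at which backward failed
        if detach_each_step:
            state = state.clone().detach()          # same values, no link to the old graph
    return None                                     # all iterations succeeded

failed_without_detach = run(detach_each_step=False)
failed_with_detach = run(detach_each_step=True)
```

In this sketch the detached `state` no longer requires gradients itself, but the loss still depends on `W`, so `backward()` keeps working in later iterations.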

This case is similar to (i.e, same variable usage):


I see, that makes a lot of sense. Thank you very much!