I’m using torch version 1.0.0.
In each gradient-update step (see the last loop in the code below) I perform one forward step and compute the loss. However, when I enter the loop a second time, I get the following error:
RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.
Playing around, I realized I can fix this error by adding the following line to the loop:
states = states.clone().detach()
Why does that solve the problem? I do not really understand what that line does. In an autograd tutorial they used a similar line to “detach the Variable from its history”, but what does that mean? I thought the graph gets cleared upon execution of loss.backward() (which is why I would get the same error if I ran loss.backward() again right after the first time), so I don’t see what additional purpose .detach() serves. To sum up my question:
Why does the code below give an error and what does .clone().detach() do to fix it?
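To check what .clone().detach() actually does, I ran this small experiment on toy tensors (unrelated to the model below; the variable names are made up):

```python
import torch

x = torch.ones(2, requires_grad=True)
y = x * 3               # y is part of the autograd graph; it has a grad_fn
z = y.clone().detach()  # same values as y, but cut off from the graph

print(y.grad_fn is None)   # False: y remembers it was computed from x
print(z.grad_fn is None)   # True: z has no history
print(z.requires_grad)     # False: z is a fresh leaf tensor
```

So detaching seems to produce a tensor with identical values but no recorded history.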
import torch
import torch.nn as nn

sft = nn.functional.softmax

def forward(StateVec, ConnectMatrix, L):
    StateVec = StateVec + (2*(-1/8*StateVec - ConnectMatrix.mm(StateVec)))*0.1
    pos = L.mm(sft(StateVec, dim=0))
    return StateVec, pos
N = 6
"""Toy target"""
target = torch.randn(2,20)
"""Randomly initialise L (which ought to be inferred later)"""
L = torch.randn(2,N)
L.requires_grad_(True)
"""Produce Connectivity Matrix rho"""
rho = torch.zeros(N, N)
for i in range(N):
    for j in range(N):
        if i == j:
            rho[i, j] = 0
        elif j == i + 1:
            rho[i, j] = 1.5
        elif j == i - 1:
            rho[i, j] = 0.5
        else:
            rho[i, j] = 1
rho[-1, 0] = 1.5
rho[0, -1] = 0.5
"""Initialise state vector as states = [0.5,0,0,...]"""
states = torch.zeros(N, 1)  # torch.Tensor(N, 1) would leave the entries uninitialised
states[0] = 0.5
states.requires_grad_(True)
lr = 0.1 # Learning Rate
for t in range(0, target.shape[1]):
    states, pos = forward(states, rho, L)
    loss = torch.sum((pos - target[:, t].float().view([2, 1]))**2)
    loss.backward()
    L.data -= L.grad.data * lr
    L.grad.data.zero_()
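For reference, here is the smallest reproduction of the same error I could boil my situation down to, stripped of the model (my own reduction, with made-up variable names):

```python
import torch

w = torch.ones(1, requires_grad=True)
s = w * 2                 # s carries history back to w, like my `states`

loss1 = (s ** 2).sum()
loss1.backward()          # frees the buffers of the graph behind s

loss2 = (s ** 2).sum()
try:
    loss2.backward()      # walks through s's freed history again -> error
except RuntimeError as e:
    print("second backward fails:", e)

s = s.clone().detach()    # s is now a fresh leaf with no history
s.requires_grad_(True)
loss3 = (s ** 2).sum()
loss3.backward()          # works: the graph only spans this iteration
```

This reproduces the error on the second backward and shows that detaching avoids it, which is what happens in my loop, but I still don’t fully understand why.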