I’m using torch version 1.0.0.
In each gradient update step (see the last loop in the code below) I perform one forward step and compute the loss. However, when I enter the loop a second time, I get the following error:
RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.
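To isolate the error from my model, I put together a minimal example (not from my actual code) that triggers the same message: a tensor that carries graph history is reused in a second loss after the first `backward()` has already freed the shared part of the graph.

```python
import torch

x = torch.ones(2, requires_grad=True)
y = x * x              # y's graph node saves x for the backward pass

loss1 = y.sum()
loss1.backward()       # works; afterwards the saved buffers of y's graph are freed

loss2 = y.sum()        # a new loss, but it still reaches back through y's old graph
failed = False
try:
    loss2.backward()   # raises: the shared part of the graph was already freed
except RuntimeError:
    failed = True
print(failed)          # True

z = y.clone().detach() # same values, but no history back to x
print(z.requires_grad) # False
```

This reproduces the error exactly, which makes me suspect my `states` tensor plays the role of `y` here.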
Playing around, I realized I can fix this error by adding the following line to the loop:
states = states.clone().detach()
Why does that solve the problem? I do not really understand what that line does; in an autograd tutorial, a similar line is used to "detach the Variable from its history". What does this mean? I thought the graph gets cleared upon execution of loss.backward() (which is why I would get the same error if I ran loss.backward() a second time right after the first), but I do not see what additional purpose .detach() serves. So, to sum up my question:
Why does the code below give an error and what does .clone().detach() do to fix it?
```python
import torch
import torch.nn as nn

sft = nn.functional.softmax

def forward(StateVec, ConnectMatrix, L):
    StateVec = StateVec + (2*(-1/8*StateVec - ConnectMatrix.mm(StateVec)))*0.1
    pos = L.mm(sft(StateVec, dim=0))
    return StateVec, pos

N = 6

"""Toy target"""
target = torch.randn(2, 20)

"""Randomly initialise L (which ought to be inferred later)"""
L = torch.randn(2, N)
L.requires_grad_(True)

"""Produce connectivity matrix rho"""
rho = torch.zeros(N, N)
for i in range(N):
    for j in range(N):
        if i == j:
            rho[i, j] = 0
        elif j == i + 1:
            rho[i, j] = 1.5
        elif j == i - 1:
            rho[i, j] = 0.5
        else:
            rho[i, j] = 1
rho[-1, 0] = 1.5
rho[0, -1] = 0.5

"""Initialise state vector as states = [0.5, 0, 0, ...]"""
states = torch.zeros(N, 1)
states[0] = 0.5
states.requires_grad_(True)

lr = 0.1  # Learning rate

for t in range(target.shape[1]):
    states, pos = forward(states, rho, L)
    loss = torch.sum((pos - target[:, t].float().view([2, 1]))**2)
    loss.backward()
    L.data -= L.grad.data * lr
    L.grad.data.zero_()
```
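For reference, this is the version that runs without the error. It is the same code; the only substantive change is the clone().detach() at the end of each iteration (the setup is repeated so the snippet runs on its own, and rho is built a bit more compactly but with the same values):

```python
import torch
import torch.nn as nn

sft = nn.functional.softmax

def forward(StateVec, ConnectMatrix, L):
    StateVec = StateVec + (2*(-1/8*StateVec - ConnectMatrix.mm(StateVec)))*0.1
    pos = L.mm(sft(StateVec, dim=0))
    return StateVec, pos

N = 6
target = torch.randn(2, 20)
L = torch.randn(2, N)
L.requires_grad_(True)

# Same connectivity pattern as above: 0 on the diagonal, 1.5 one step
# ahead (wrapping around), 0.5 one step behind, 1 elsewhere
rho = torch.ones(N, N)
for i in range(N):
    rho[i, i] = 0
    rho[i, (i + 1) % N] = 1.5
    rho[i, (i - 1) % N] = 0.5

states = torch.zeros(N, 1)
states[0] = 0.5
states.requires_grad_(True)

lr = 0.1
for t in range(target.shape[1]):
    states, pos = forward(states, rho, L)
    loss = torch.sum((pos - target[:, t].float().view([2, 1]))**2)
    loss.backward()
    L.data -= L.grad.data * lr
    L.grad.data.zero_()
    # Cut states off from the graph that was just backpropagated through,
    # so the next iteration builds a fresh graph instead of reusing this one
    states = states.clone().detach()
```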