for j in range(n_rnn_batches):
    print x.size()
    # fresh LSTM-optimizer state for this batch
    h_t = Variable(torch.zeros(x.size(0), 20))
    c_t = Variable(torch.zeros(x.size(0), 20))
    h_t2 = Variable(torch.zeros(x.size(0), 20))
    c_t2 = Variable(torch.zeros(x.size(0), 20))
    for s in range(n_steps / n_bptt_steps):
        # truncate BPTT: rewrap the hidden state so it is cut off from the previous window
        h_t, c_t, h_t2, c_t2 = Variable(h_t.data), Variable(c_t.data), Variable(h_t2.data), Variable(c_t2.data)
        optimizer.zero_grad()
        for s2 in range(n_bptt_steps):
            data, target = next(iter(train_loader))
            data, target = Variable(data), Variable(target)
            output = optimizee(data, x)
            loss = F.nll_loss(output, target)
            # gradient of the optimizee loss w.r.t. its (flattened) parameters x
            grads = autograd.grad(loss, x, retain_graph=True)[0].view(x.size(0), -1)
            grads.volatile = False
            grads = grads.detach()
            # the LSTM optimizer proposes an update from the gradient
            out, h_t, c_t, h_t2, c_t2 = rnn(grads, h_t, c_t, h_t2, c_t2)
            x = x - out
        print x.volatile
        loss.backward()
        optimizer.step()
I'm trying to implement Learning to Learn by Gradient Descent by Gradient Descent, but I'm getting the "Trying to backward through the graph a second time" error at the loss.backward() step, even though I have set retain_graph to True. Can someone please help?
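To check that I understand the error message itself, here is a tiny, stripped-down example (toy variables a, b, loss1, loss2, not from my actual code) where the same message appears: two losses share one intermediate Variable, the first backward() frees that shared part of the graph, and the second backward() then has to walk it again.

# Toy reproduction, separate from the training loop above: two losses share
# the intermediate Variable b, so the second backward() revisits b's graph
# after the first backward() has already freed its buffers.
import torch
from torch.autograd import Variable

a = Variable(torch.randn(4), requires_grad=True)
b = a * a                 # intermediate node with saved tensors

loss1 = b.sum()
loss1.backward()          # frees b's part of the graph (no retain_graph here)

loss2 = b.sum()
loss2.backward()          # RuntimeError: Trying to backward through the graph a second time

Is something similar happening in my loop, i.e. that x = x - out keeps the graph of all earlier steps attached to x, so the next loss.backward() walks through parts that an earlier backward() already freed?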