I am trying to backpropagate the loss from an LSTM-MDN network and get the following error:
RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.
I am passing 3200 vectors of size 128 into my network, which has 1024 hidden units:
```python
model = RNNMDN(128, 1024, 8, 1).to(device)
hidden = model.initialize_hidden(32)
x = encoded_tensors.view(-1, 100, 128)
```
This reshapes my input from [3200, 128] to [32, 100, 128] (a batch size of 32, a sequence length of 100, and an input dimension of 128).
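As a quick sanity check, the reshape does give the expected shape (a minimal sketch with random data standing in for my real `encoded_tensors`):

```python
import torch

# stand-in for the real encoded vectors: 3200 latents of size 128
encoded_tensors = torch.randn(3200, 128)
x = encoded_tensors.view(-1, 100, 128)
print(x.shape)  # torch.Size([32, 100, 128])
```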
```python
inputs = x[:, 0:25, :]    # first 25 steps of each sequence
targets = x[:, 25:50, :]  # next 25 steps as prediction targets
hidden = detach(hidden)
(pi, mu, sigma), hidden = model(inputs, hidden)
```
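Here `detach` is my own small helper that cuts the hidden state loose from the previous graph; roughly this (a sketch, assuming `hidden` is the usual `(h, c)` tuple from an LSTM):

```python
def detach(states):
    # detach each state tensor so backward() does not try to
    # traverse the graph built by earlier chunks again
    return tuple(s.detach() for s in states)
```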
```python
loss = mdn_loss_function(targets, pi, mu, sigma)
print(loss)
```
This gives:

```
tensor(1.4209, device='cuda:0', grad_fn=<...>)
```
However, when I call `loss.backward()` it raises the runtime error above, and even when I pass `retain_graph=True` to `backward()`, it still gives the same error.
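Concretely, both of these fail with the identical message:

```python
loss.backward()                   # RuntimeError: Trying to backward through the graph a second time ...
loss.backward(retain_graph=True)  # same RuntimeError on a fresh run
```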
My loss function is as follows:
```python
def mdn_loss_function(y, out_pi, out_mu, out_sigma):
    # per-component Gaussians over each of the 128 features
    result = torch.distributions.Normal(loc=out_mu, scale=out_sigma)
    # add a singleton axis so the targets broadcast against the mixture dim
    y = y.view(-1, sequence, 1, 128)
    result = torch.exp(result.log_prob(y))
    # weight by the mixing coefficients and sum over the mixture components
    result = torch.sum(result * out_pi, dim=2)
    result = -torch.log(result)
    return torch.mean(result)
```
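For reference, these are the shapes I believe the function expects, assuming the network emits K = 8 mixture components per feature (matching the 8 passed to RNNMDN); a minimal dummy-data sketch that runs end to end:

```python
import torch

batch, sequence, K, features = 32, 25, 8, 128  # sequence is read as a global by the loss

# stand-ins for the network outputs and the targets
pi      = torch.softmax(torch.randn(batch, sequence, K, features), dim=2)
mu      = torch.randn(batch, sequence, K, features)
sigma   = torch.exp(torch.randn(batch, sequence, K, features))
targets = torch.randn(batch, sequence, features)

loss = mdn_loss_function(targets, pi, mu, sigma)
print(loss)  # a scalar tensor, as above
```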