Recently I’ve worked on implementing a WGAN-GP (https://arxiv.org/pdf/1704.00028.pdf) by myself.
For those of you who are not familiar with WGAN-GP, it uses a gradient penalty, instead of weight clipping, to enforce the 1-Lipschitz constraint on the WGAN’s discriminator (critic) function.
This gradient penalty term is a function of the gradient of the discriminator’s output with respect to its input, evaluated on specially constructed inputs (in the paper, random interpolates between real and generated samples). Since the penalty is part of the loss, backpropagating through it essentially means taking a second derivative, for which we use the function torch.autograd.grad.
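To make the penalty concrete, here is a minimal sketch of how it is commonly computed in PyTorch. The helper name gradient_penalty, the toy critic, and the weight lambda_gp=10.0 are illustrative assumptions, not the code from the actual model.

```python
import torch
from torch import nn


def gradient_penalty(critic, real, fake, lambda_gp=10.0):
    """Hypothetical helper: WGAN-GP penalty on random interpolates of real and fake."""
    batch_size = real.size(0)
    # One mixing coefficient per sample, broadcast over the remaining dims.
    eps = torch.rand(batch_size, *([1] * (real.dim() - 1)), device=real.device)
    x_hat = (eps * real.detach() + (1 - eps) * fake.detach()).requires_grad_(True)
    scores = critic(x_hat)
    # create_graph=True keeps the gradient computation differentiable,
    # so the penalty itself can be backpropagated to the critic's weights.
    grads, = torch.autograd.grad(
        outputs=scores,
        inputs=x_hat,
        grad_outputs=torch.ones_like(scores),
        create_graph=True,
    )
    grad_norm = grads.reshape(batch_size, -1).norm(2, dim=1)
    return lambda_gp * ((grad_norm - 1) ** 2).mean()


# Toy critic and data, chosen only for illustration.
critic = nn.Sequential(nn.Linear(8, 16), nn.Tanh(), nn.Linear(16, 1))
real = torch.randn(4, 8)
fake = torch.randn(4, 8)
penalty = gradient_penalty(critic, real, fake)
penalty.backward()  # second-order backward pass through the gradient
```

The call to penalty.backward() is the step that differentiates through the gradient, which is where double backprop comes in.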
My original problem was that even though I pass the argument retain_graph=True to grad, when I call backward on the loss I encounter the error ‘Trying to backward through the graph a second time…’.
From debugging, I learned that the error was caused by my model containing a residual layer that performed an inplace ReLU. I removed the inplace operation and my code now works, but I’m still not sure what exactly goes wrong, and it really annoys me.
I have constructed a toy example that, to my understanding, suffers from the same issue. Can anyone please help me understand what exactly the error is?
import torch
from torch import nn
from torchviz import make_dot


def double_backprop(inputs, net):
    y = net(inputs).mean()
    grad, = torch.autograd.grad(y, inputs, create_graph=True, retain_graph=True)
    return grad.pow(2).mean() + y


class TestNet(nn.Module):
    """
    A network for testing double backprop
    """
    def __init__(self):
        super(TestNet, self).__init__()

    def forward(self, input):
        output = input.transpose(1, 2)
        output = nn.Conv1d(4, 100, 1)(output)
        # If I remove either the second ReLU layer, or the inplace argument, this works.
        output = nn.ReLU(True)(output)
        output = nn.ReLU()(output)
        output = output.view(-1, 500)
        output = nn.Linear(500, 1)(output)
        return output


model = TestNet()
x = torch.randn((64, 50, 4), requires_grad=True)
out = double_backprop(x, model)
out.backward()
# make_dot(out)
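For comparison, here is a minimal sketch of the variant without the inplace argument, which according to the comment in the toy example runs without the error. The class name TestNetNoInplace and the move of the layers into __init__ are assumptions made for tidiness, not part of the original example.

```python
import torch
from torch import nn


class TestNetNoInplace(nn.Module):
    """Hypothetical variant of TestNet that uses out-of-place ReLUs."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv1d(4, 100, 1)
        self.relu1 = nn.ReLU()  # inplace defaults to False
        self.relu2 = nn.ReLU()
        self.linear = nn.Linear(500, 1)

    def forward(self, inputs):
        output = inputs.transpose(1, 2)
        output = self.conv(output)
        output = self.relu1(output)
        output = self.relu2(output)
        output = output.view(-1, 500)
        return self.linear(output)


def double_backprop(inputs, net):
    y = net(inputs).mean()
    # create_graph=True builds a graph of the gradient computation itself.
    grad, = torch.autograd.grad(y, inputs, create_graph=True)
    return grad.pow(2).mean() + y


x = torch.randn(64, 50, 4, requires_grad=True)
out = double_backprop(x, TestNetNoInplace())
out.backward()
```

The only behavioural difference from the failing version is that the first ReLU no longer overwrites the convolution output in place.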