Hey there! I am trying out an implementation of the Improved WGAN (WGAN-GP), but I am unfortunately running into this error when calling gradient_penalty.backward():
RuntimeError: Scatter is not differentiable twice
Could someone help? The relevant code is below:
gradient_penalty = get_grad_pen(self.Dis_net, X, X_f.cpu().data, lmbda)
gradient_penalty.backward()
get_grad_pen is defined as follows:
def get_grad_pen(Dis_net, real_data, fake_data, lmbda):
    # Sample epsilon and expand it to the shape of the data.
    # Note: expand_as is not in-place, so the result must be reassigned.
    epsilon = t.FloatTensor(real_data.size(1), real_data.size(2), real_data.size(3)).uniform_(0, 1)
    epsilon = epsilon.expand_as(real_data)
    interpolated_data = real_data * epsilon + fake_data * (1 - epsilon)
    interpolated_dataV = V(interpolated_data.cuda(), requires_grad=True)
    # create_graph=True so that the penalty itself can be backpropagated through.
    gradients = t.autograd.grad(outputs=Dis_net(interpolated_dataV).mean(0).view(1),
                                inputs=interpolated_dataV,
                                create_graph=True, retain_graph=True,
                                only_inputs=True)[0]
    # Per-sample gradient norm, taken over all non-batch dimensions.
    grad_pen = ((gradients.view(gradients.size(0), -1).norm(2, dim=1) - 1).pow(2)).mean().mul(lmbda)
    return grad_pen
V is torch.autograd.Variable, in case you were wondering.
Also, note that this happens only on the GPU. If everything is kept on the CPU, i.e. interpolated_dataV = V(interpolated_data, requires_grad=True), the error does not occur.
It also seems to be specific to multi-GPU support: when I run without nn.parallel.data_parallel, it works fine.
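For completeness, here is a minimal sketch of the single-device path that works for me, written against the current tensor API for brevity (the toy discriminator dis is a placeholder, not my actual Dis_net). The point is that with a plain forward pass, i.e. no scatter/gather from data_parallel in the graph, the double backward through the gradient penalty goes through:

```python
import torch
import torch.nn as nn

# Placeholder discriminator standing in for Dis_net.
dis = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))

real = torch.randn(4, 8)
fake = torch.randn(4, 8)

# One epsilon per sample, broadcast across the feature dimension.
eps = torch.rand(real.size(0), 1)
interp = (eps * real + (1 - eps) * fake).requires_grad_(True)

# First backward pass: gradient of the critic output w.r.t. the interpolates.
# create_graph=True keeps this differentiable so the penalty can be backpropagated.
grads = torch.autograd.grad(
    outputs=dis(interp).sum(),
    inputs=interp,
    create_graph=True,
)[0]

# WGAN-GP penalty: (||grad||_2 - 1)^2, averaged over the batch.
grad_pen = ((grads.view(grads.size(0), -1).norm(2, dim=1) - 1) ** 2).mean()

# Second backward pass succeeds when data_parallel is not in the graph.
grad_pen.backward()
```

With nn.parallel.data_parallel wrapped around the same forward call, this is where the Scatter error appears instead.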