Grad of grad fails on multiple GPUs

Grad of grad seems to fail on multiple GPUs with the following error:

RuntimeError: arguments are located on different GPUs at /pytorch/torch/lib/THC/generated/…/generic/

Small snippet:

        # interpolation points we differentiate with respect to
        interp_points = Variable(some_tensor, requires_grad=True)
        errD_interp_vec = netD(interp_points)
        # first derivative; create_graph=True so it can be differentiated again
        errD_gradient, = torch.autograd.grad(errD_interp_vec.sum(), interp_points, create_graph=True)
        lip_est = errD_gradient.view(batch_size, -1).sum(1)
        # gradient penalty term; backpropagating this triggers the grad of grad
        lip_loss = penalty_weight * ((1.0 - lip_est) ** 2).mean(0).view(1)

If the backward is computed directly on netD(interp_points), everything is fine. netD is wrapped in nn.DataParallel.
Does anyone have any idea?
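For reference, a more self-contained sketch of the setup. This is an assumption about the missing pieces, not the original code: netD here is a tiny stand-in discriminator, and batch_size and penalty_weight are made-up illustrative values. It wraps the model in nn.DataParallel only when multiple GPUs are available, and otherwise runs the same double-backward on CPU:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for netD (assumption, not the original model)
netD = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
device = "cuda" if torch.cuda.is_available() else "cpu"
if torch.cuda.device_count() > 1:
    # the multi-GPU configuration from the report
    netD = nn.DataParallel(netD)
netD = netD.to(device)

batch_size = 2        # assumed value
penalty_weight = 10.0 # assumed value

# interpolation points we differentiate with respect to
interp_points = torch.randn(batch_size, 4, device=device, requires_grad=True)
errD_interp_vec = netD(interp_points)

# first derivative; create_graph=True so it can be differentiated again
errD_gradient, = torch.autograd.grad(
    errD_interp_vec.sum(), interp_points, create_graph=True)
lip_est = errD_gradient.view(batch_size, -1).sum(1)
lip_loss = penalty_weight * ((1.0 - lip_est) ** 2).mean(0).view(1)

# grad of grad: this backward is where the multi-GPU error was reported
lip_loss.backward()
```

On a single device this runs through; the reported RuntimeError appears only in the DataParallel branch on multiple GPUs.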

If you give a small script that reproduces this error, I will investigate further.