Hi! I am trying to run CycleGAN code and I get an error during backpropagation of the discriminator loss. The error appears to come from the calculation of ‘gradientPenalty’, but I cannot figure out how to fix it. I would be really thankful if someone could suggest a solution.
The error is as follows:
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [3, 16, 5, 5]] is at version 3; expected version 2 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
The error is raised on this line:
dLoss.backward(retain_graph=True)
for epoch in range(NUM_EPOCHS_TRAIN):
    for param_group in optimizer_d.param_groups:
        param_group['lr'] = adjustLearningRate(learning_rate, epoch_num=epoch, decay_rate=DECAY_RATE)
    for param_group in optimizer_g.param_groups:
        param_group['lr'] = adjustLearningRate(learning_rate, epoch_num=epoch, decay_rate=DECAY_RATE)
    for i, (data, gt1) in enumerate(trainLoader_cross, 0):
        input, dummy = data
        groundTruth, dummy = gt1
        trainInput = Variable(input.type(Tensor_gpu)) # X
        real_imgs = Variable(groundTruth.type(Tensor_gpu)) # Y
        optimizer_g.zero_grad()
        fake_imgs = generator(trainInput) # Y'
        gLoss = computeGeneratorLoss(trainInput, fake_imgs, discriminator, criterion)
        gLoss.backward(retain_graph=True)
        optimizer_g.step()
        optimizer_d.zero_grad()
        realValid = discriminator(real_imgs) # D_Y
        fakeValid = discriminator(fake_imgs) # D_Y'
        gradientPenalty = computeGradientPenaltyFor1WayGAN(discriminator, real_imgs.data, fake_imgs.data)
        dLoss = computeDiscriminatorLoss(realValid, fakeValid, gradientPenalty)
        dLoss.backward(retain_graph=True)
        optimizer_d.step()
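One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: `optimizer_g.step()` updates the generator weights in place, while `fakeValid = discriminator(fake_imgs)` still carries the generator's graph, so `dLoss.backward()` may be trying to backpropagate through weights that have already changed. The shape `[3, 16, 5, 5]` in the message looks like a convolution weight, which would fit that reading. Below is a minimal sketch of the usual workaround, detaching the fake batch before the discriminator pass. The models here are hypothetical stand-ins (tiny networks on CPU, not the real `generator` and `discriminator`) and the losses are simplified placeholders.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the real generator and discriminator.
generator = nn.Sequential(nn.Linear(4, 8), nn.Tanh(), nn.Linear(8, 4))
discriminator = nn.Linear(4, 1)
optimizer_g = torch.optim.SGD(generator.parameters(), lr=0.1)
optimizer_d = torch.optim.SGD(discriminator.parameters(), lr=0.1)

train_input = torch.randn(8, 4)  # X
real_imgs = torch.randn(8, 4)    # Y

# Generator step: gradients flow through the discriminator into the generator.
optimizer_g.zero_grad()
fake_imgs = generator(train_input)          # Y'
g_loss = -discriminator(fake_imgs).mean()   # placeholder generator loss
g_loss.backward()
optimizer_g.step()                          # modifies generator weights in place

# Remember the generator gradients so we can check they are left alone.
grads_before = [p.grad.clone() for p in generator.parameters()]

# Discriminator step: detach the fake batch so backward stops at the
# discriminator input and never revisits the (now updated) generator.
optimizer_d.zero_grad()
real_valid = discriminator(real_imgs)            # D_Y
fake_valid = discriminator(fake_imgs.detach())   # D_Y'
d_loss = fake_valid.mean() - real_valid.mean()   # placeholder discriminator loss
d_loss.backward()
optimizer_d.step()

generator_grads_unchanged = all(
    torch.equal(before, p.grad)
    for before, p in zip(grads_before, generator.parameters())
)
```

In this sketch neither `backward()` call needs `retain_graph=True`, because the discriminator pass no longer reuses the generator's graph.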
def computeGradientPenaltyFor1WayGAN(discriminator, realSample, fakeSample):
    alpha = Tensor_gpu(np.random.random((realSample.shape)))
    interpolates = (alpha * realSample + (1 - alpha) * fakeSample).requires_grad_(True)
    dInterpolation = discriminator(interpolates)
    gradients = autograd.grad(
        outputs=dInterpolation,
        inputs=interpolates,
        grad_outputs=torch.ones(dInterpolation.size()).cuda(),
        create_graph=True,
        retain_graph=True,
        only_inputs=True)[0]
    gradients = gradients.view(gradients.size(0), -1)
    normGradients = gradients.norm(2, dim=1) - 1
    for i in range(len(normGradients)):
        if normGradients[i] < 0:
            normGradients[i] = 0
    gradientPenalty = normGradients.mean()
    return gradientPenalty
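The element-wise loop that zeroes the negative entries of `normGradients` could also be written with `torch.clamp`, which avoids assigning into a tensor that is part of the autograd graph. This is a sketch under assumptions: the name `compute_gradient_penalty` and the toy linear discriminator are hypothetical, `torch.rand_like` and `torch.ones_like` stand in for the `Tensor_gpu` and `.cuda()` calls so the function runs on whatever device its inputs live on, and the penalty is intended to be the same max(0, ||grad|| - 1) averaged over the batch.

```python
import torch
import torch.nn as nn
from torch import autograd


def compute_gradient_penalty(discriminator, real_sample, fake_sample):
    # Random element-wise mix of real and fake samples, same shape as the inputs.
    alpha = torch.rand_like(real_sample)
    interpolates = alpha * real_sample + (1 - alpha) * fake_sample
    # Cut any link to the generator graph and track gradients w.r.t. the mix.
    interpolates = interpolates.detach().requires_grad_(True)

    d_interpolation = discriminator(interpolates)
    gradients = autograd.grad(
        outputs=d_interpolation,
        inputs=interpolates,
        grad_outputs=torch.ones_like(d_interpolation),
        create_graph=True,  # keep the graph so the penalty itself is differentiable
    )[0]

    gradients = gradients.reshape(gradients.size(0), -1)
    # One-sided penalty: only gradient norms above 1 contribute.
    return torch.clamp(gradients.norm(2, dim=1) - 1, min=0).mean()


# Toy check with a hypothetical linear discriminator whose input gradient
# is its weight row (3, 0, 4, 0), i.e. norm 5 for every sample.
toy_discriminator = nn.Linear(4, 1, bias=False)
with torch.no_grad():
    toy_discriminator.weight.copy_(torch.tensor([[3.0, 0.0, 4.0, 0.0]]))

real = torch.randn(6, 4)
fake = torch.randn(6, 4)
penalty = compute_gradient_penalty(toy_discriminator, real, fake)
```

For the toy discriminator the gradient norm is 5 for every sample, so the penalty comes out as 4, and it still has a graph attached so it can be added to the discriminator loss before calling `backward()`.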