Error in backpropagation (discriminator loss) in CycleGAN

Hi! I am trying to run a CycleGAN training script and I am hitting an error during backpropagation of the discriminator loss. The error apparently comes from the computation of 'gradientPenalty', and I cannot figure out how to fix it. I would be really thankful if someone could suggest a solution.

The error is the following:
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [3, 16, 5, 5]] is at version 3; expected version 2 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!

The error is raised at this line:
dLoss.backward(retain_graph=True)

for epoch in range(NUM_EPOCHS_TRAIN):

    for param_group in optimizer_d.param_groups:
        param_group['lr'] = adjustLearningRate(learning_rate, epoch_num=epoch, decay_rate=DECAY_RATE)

    for param_group in optimizer_g.param_groups:
        param_group['lr'] = adjustLearningRate(learning_rate, epoch_num=epoch, decay_rate=DECAY_RATE)

    for i, (data, gt1) in enumerate(trainLoader_cross, 0):
        input, dummy = data
        groundTruth, dummy = gt1
        trainInput = Variable(input.type(Tensor_gpu))           # X
        real_imgs = Variable(groundTruth.type(Tensor_gpu))      # Y

        optimizer_g.zero_grad()
        fake_imgs = generator(trainInput)                       # Y'
        gLoss = computeGeneratorLoss(trainInput, fake_imgs, discriminator, criterion)
        gLoss.backward(retain_graph=True)
        optimizer_g.step()
        optimizer_d.zero_grad()

        realValid = discriminator(real_imgs)                    # D_Y
        fakeValid = discriminator(fake_imgs)                    # D_Y'

        gradientPenalty = computeGradientPenaltyFor1WayGAN(discriminator, real_imgs.data, fake_imgs.data)

        dLoss = computeDiscriminatorLoss(realValid, fakeValid, gradientPenalty)
        dLoss.backward(retain_graph=True)
        optimizer_d.step()

def computeGradientPenaltyFor1WayGAN(discriminator, realSample, fakeSample):
    # Random interpolation between real and fake samples
    alpha = Tensor_gpu(np.random.random(realSample.shape))
    interpolates = (alpha * realSample + (1 - alpha) * fakeSample).requires_grad_(True)
    dInterpolation = discriminator(interpolates)

    # Gradient of D(interpolates) with respect to the interpolated images
    gradients = autograd.grad(
        outputs=dInterpolation,
        inputs=interpolates,
        grad_outputs=torch.ones(dInterpolation.size()).cuda(),
        create_graph=True,
        retain_graph=True,
        only_inputs=True)[0]

    gradients = gradients.view(gradients.size(0), -1)
    # One-sided penalty: only gradient norms above 1 are penalised
    normGradients = gradients.norm(2, dim=1) - 1
    for i in range(len(normGradients)):
        if normGradients[i] < 0:
            normGradients[i] = 0

    gradientPenalty = normGradients.mean()
    return gradientPenalty

Could you explain why retain_graph=True is used? It is often not needed, and some users add it only to work around a previous error, which could then result in your reported issue if your workflow is similar to this one.
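
If that is the case here, a common restructuring is to let the generator update own its graph and then rebuild the discriminator inputs from detached fake images, so that neither backward() call needs retain_graph=True and the in-place weight update from optimizer_g.step() no longer invalidates tensors saved for the discriminator pass. Below is a minimal sketch of that ordering, reusing the names from your loop; it assumes your loss helpers don't need gradients flowing back into the generator during the D update, so it is a sketch rather than a drop-in fix:

    # Sketch only: reuses generator, discriminator, the optimizers, and the
    # loss helpers from the post above; the ordering and detaching are the changes.
    optimizer_g.zero_grad()
    fake_imgs = generator(trainInput)                       # Y'
    gLoss = computeGeneratorLoss(trainInput, fake_imgs, discriminator, criterion)
    gLoss.backward()                                        # graph is used once, no retain_graph
    optimizer_g.step()

    optimizer_d.zero_grad()
    fake_detached = fake_imgs.detach()                      # cut the graph back to the generator
    realValid = discriminator(real_imgs)                    # D_Y
    fakeValid = discriminator(fake_detached)                # D_Y'
    gradientPenalty = computeGradientPenaltyFor1WayGAN(discriminator, real_imgs.data, fake_detached.data)
    dLoss = computeDiscriminatorLoss(realValid, fakeValid, gradientPenalty)
    dLoss.backward()                                        # its own fresh graph, nothing retained
    optimizer_d.step()

As a side note, the clamping loop inside computeGradientPenaltyFor1WayGAN writes into normGradients in place while that tensor is part of the penalty graph; normGradients.clamp(min=0) expresses the same one-sided penalty without in-place indexing.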