Issue with gradient computation for a Generative Adversarial Network - on the discriminator loss

Hi All,
I would like to ask for your assistance with the following error:
Error message: "one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1, 16, 3, 3]] is at version 2; expected version 1 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!"
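From reading around, my understanding (which may well be wrong) is that a tensor saved during the forward pass was modified in place before backward() could use it, and that optimizer.step() counts as an in-place modification of the parameters it updates. A minimal sketch of that failure mode, unrelated to my model:

import torch

lin1 = torch.nn.Linear(2, 2)
lin2 = torch.nn.Linear(2, 2)
opt = torch.optim.SGD(list(lin1.parameters()) + list(lin2.parameters()), lr=0.1)

y = lin2(lin1(torch.randn(1, 2)))   # lin2.weight is saved for the backward pass
y.sum().backward(retain_graph=True)
opt.step()                          # in-place update bumps the weights' version counter
y.sum().backward()                  # raises the same "modified by an inplace operation" error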
In my case, I am getting the error whenever I run the following code:

it = 0
start_time = time.time()
try:
    for epoch in range(args['epochs']):
        for x in x_batches:
            z = encoder(x)
            out = decoder(z)
            disc = discriminator(torch.lerp(out, x, args['reg']))
            alpha = torch.rand(args['batch_size'], 1, 1, 1).to(args['device']) / 2
            z_mix = lerp(z, swap_halves(z), alpha)
            out_mix = decoder(z_mix)
            disc_mix = discriminator(out_mix)

            loss_ae_mse = F.mse_loss(out, x)
            loss_ae_l2 = L2(disc_mix) * args['advweight']
            loss_ae = loss_ae_mse + loss_ae_l2

            opt_ae.zero_grad()
            loss_ae.backward(retain_graph=True)
            opt_ae.step()

            loss_disc_mse = F.mse_loss(disc_mix, alpha.reshape(-1))
            loss_disc_l2 = L2(disc)
            loss_disc = loss_disc_mse + loss_disc_l2

            opt_d.zero_grad()
            loss_disc.backward()
            opt_d.step()

            losses['std(disc_mix)'].append(torch.std(disc_mix).item())
            losses['loss_disc_mse'].append(loss_disc_mse.item())
            losses['loss_disc_l2'].append(loss_disc_l2.item())
            losses['loss_disc'].append(loss_disc.item())
            losses['loss_ae_mse'].append(loss_ae_mse.item())
            losses['loss_ae_l2'].append(loss_ae_l2.item())
            losses['loss_ae'].append(loss_ae.item())

            if it % 100 == 0:
                img = status()

                plt.figure(facecolor='w', figsize=(10, 4))
                for key in losses:
                    total = len(losses[key])
                    skip = 1 + (total // 1000)
                    y = build_batches(losses[key], skip).mean(axis=-1)
                    x = np.linspace(0, total, len(y))
                    plt.plot(x, y, label=key, lw=0.5)
                plt.legend(loc='upper right')

                clear_output(wait=True)
                plt.show()
                show_array(img * 255)

                speed = args['batch_size'] * it / (time.time() - start_time)
                print(f'{epoch+1}/{args["epochs"]}; {speed:.2f} samples/sec')

            it += 1

except KeyboardInterrupt:
    pass

The answer at https://discuss.pytorch.org/t/one-of-the-variables-needed-for-gradient-computation-has-been-modified-by-an-inplace-operation-torch-cuda-floattensor-3-48-3-3-is-at-version-2-expected-version-1-instead/83241/2 seems relevant to my case, but I am still unable to figure out how to refactor my code accordingly. I would greatly appreciate your assistance. The full code is available at: https://gist.github.com/kylemcdonald/e8ca989584b3b0e6526c0a737ed412f0
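If I understand that answer correctly, the problem is that opt_ae.step() modifies the encoder/decoder weights in place, while the later loss_disc.backward() still has to backpropagate through the graph that used the old versions of those weights (via disc_mix and disc). My best guess at the refactor, sketched below, re-runs the discriminator on detached tensors for the discriminator loss, so its backward pass stops at the discriminator and never revisits the autoencoder weights; I believe this also makes retain_graph=True unnecessary:

# autoencoder update, as before but without retain_graph=True
opt_ae.zero_grad()
loss_ae.backward()
opt_ae.step()

# discriminator update: fresh forward passes on detached inputs, so
# backward() never touches the encoder/decoder weights that
# opt_ae.step() just modified in place
disc_d = discriminator(torch.lerp(out, x, args['reg']).detach())
disc_mix_d = discriminator(out_mix.detach())

loss_disc_mse = F.mse_loss(disc_mix_d, alpha.reshape(-1))
loss_disc_l2 = L2(disc_d)
loss_disc = loss_disc_mse + loss_disc_l2

opt_d.zero_grad()
loss_disc.backward()
opt_d.step()

Is this the intended fix, or is there a cleaner way that keeps the single discriminator forward pass?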

Thank you …