Hi everyone, I am working with a generative network and I got this error:
RuntimeError: Trying to backward through the graph a second time (or directly access saved variables after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved variables after calling backward.
This is the code that is causing it:
for (input_image, _), (target_image, _) in zip(dataloader_A, dataloader_B):
    generator_optimizer.zero_grad()
    discriminator_optimizer.zero_grad()

    input_image, target_image = input_image.cuda(), target_image.cuda()

    gen_output = netG(input_image)
    disc_real_output = netD(input_image, target_image)
    disc_generated_output = netD(input_image, gen_output)

    gen_total_loss, gen_gan_loss, gen_l1_loss = generator_loss(disc_generated_output, gen_output, target_image)
    gen_total_loss.backward()

    disc_loss = discriminator_loss(disc_real_output, disc_generated_output)
    disc_loss.backward()

    generator_optimizer.step()
    discriminator_optimizer.step()
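For reference, here is a minimal standalone snippet (toy tensors, nothing from my actual model) where the same error appears on a second backward unless retain_graph=True is passed to the first one:

```python
import torch

# Toy example of backpropagating twice through the same graph.
x = torch.tensor(2.0, requires_grad=True)
y = x * x                      # intermediate values saved for backward

y.backward(retain_graph=True)  # keep the graph (and saved tensors) alive
y.backward()                   # second backward works; without retain_graph
                               # it raises the RuntimeError quoted above

print(x.grad)                  # gradients accumulate across both calls
```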
I know the error can be fixed by passing retain_graph=True to the first backward call. However, my doubt is mainly about why this happens at all if I am working with two different loss functions. I implemented the DCGAN from the PyTorch docs and its structure is pretty similar, as you can check in the following code, yet retain_graph is not required there:
netD.zero_grad()
real_cpu = data[0].to(device)
b_size = real_cpu.size(0)
label = torch.full((b_size,), real_label, dtype=torch.float, device=device)
output = netD(real_cpu).view(-1)
errD_real = criterion(output, label)
errD_real.backward()
D_x = output.mean().item()
noise = torch.randn(b_size, nz, 1, 1, device=device)
fake = netG(noise)
label.fill_(fake_label)
output = netD(fake.detach()).view(-1)
errD_fake = criterion(output, label)
errD_fake.backward()
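One thing I noticed while comparing the two: the DCGAN code calls netD(fake.detach()). If I understand correctly, detach() cuts the tensor out of the generator's graph, so backward on the discriminator loss stops there. A tiny standalone sketch (toy scalars, not my model) of what I think it does:

```python
import torch

w = torch.tensor(2.0, requires_grad=True)  # stand-in for a generator parameter
fake = w * 3                               # "generator" forward pass
d_in = fake.detach()                       # like netD(fake.detach())
d_in.requires_grad_()                      # discriminator side still gets gradients
d_out = d_in * 5                           # "discriminator" forward pass

d_out.backward()                           # backward stops at the detach point
print(w.grad)                              # None: nothing reached the generator
print(d_in.grad)                           # gradient w.r.t. the detached input
```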
Thanks in advance.