I am a beginner with PyTorch, and I ran into the following error while training a DCGAN:
RuntimeError: Trying to backward through the graph a second time, but the saved intermediate results have already been freed. Specify retain_graph=True when calling backward the first time.
Here is the code for my training loop:
# training loop
n_epochs = 10
img_list = []
G_losses = []
D_losses = []
iters = 0

print("Starting training loop...")
print(f"Running on {device}")

for epoch in range(n_epochs):
    for i, data in enumerate(train_loader, 0):
        # Setup
        real_images = data[0].to(device)
        batch_size = real_images.size(0)
        real_label_vector = torch.full((batch_size,), real_label, dtype=torch.float, device=device)  # vector of 1s, length = batch_size
        fake_label_vector = torch.full((batch_size,), fake_label, dtype=torch.float, device=device)  # vector of 0s, length = batch_size

        #######################################
        #### Training of the Discriminator ####
        #######################################
        # Forward propagation of real images
        netD.zero_grad()
        real_output = netD(real_images).view(-1)  # flatten the output to a 1-D vector
        errD_real = criterion(real_output, real_label_vector)
        errD_real.backward()
        D_x = real_output.mean().item()

        # Forward propagation of fake images
        noise = torch.randn(batch_size, input_size, 1, 1, device=device)
        # Generate a batch of fake images
        fake_image = netG(noise)
        # Feed the fake images to the discriminator
        fake_output = netD(fake_image).view(-1)
        errD_fake = criterion(fake_output, fake_label_vector)
        errD_fake.backward()
        D_G_z1 = fake_output.mean().item()
        errD = errD_real + errD_fake
        optimizerD.step()

        #################################
        ### Training of the Generator ###
        #################################
        netG.zero_grad()
        # Since we just updated D, perform another forward pass through netD
        fake_output = netD(fake_image).view(-1)
        errG = criterion(fake_output, real_label_vector)  # netG wants netD to see fake_image as "real"
        ##############################################
        errG.backward()  ##### The error happened here
        D_G_z2 = fake_output.mean().item()
        optimizerG.step()

        if i % 50 == 0:
            print(f'\n[{epoch}/{n_epochs}]-[{i}/{len(train_loader)}] Loss_D: {errD.item()}, Loss_G: {errG.item()}')
            print(f'D(G(z)): {D_G_z1}/{D_G_z2}')

        if (iters % 500 == 0) or ((epoch == n_epochs - 1) and (i == len(train_loader) - 1)):
            with torch.no_grad():
                fake = netG(fixed_noise).detach().cpu()
            img_list.append(vutils.make_grid(fake, padding=2, normalize=True))

        iters += 1
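The loop uses a few names I did not include above; they come straight from the tutorial's setup and look roughly like this (my input_size plays the role of the tutorial's nz, and my exact values may differ slightly):

import torch
import torch.nn as nn
import torch.optim as optim
import torchvision.utils as vutils

real_label = 1.
fake_label = 0.
input_size = 100  # length of the latent vector z (called nz in the tutorial)
criterion = nn.BCELoss()
fixed_noise = torch.randn(64, input_size, 1, 1, device=device)  # for tracking progress
optimizerD = optim.Adam(netD.parameters(), lr=0.0002, betas=(0.5, 0.999))
optimizerG = optim.Adam(netG.parameters(), lr=0.0002, betas=(0.5, 0.999))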
I really don’t know where I went wrong, since I only adapted the code from https://pytorch.org/tutorials/beginner/dcgan_faces_tutorial.html so that it generates handwritten digits instead of faces.
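For reference, the all-fake part of the discriminator update in the tutorial looks roughly like this (paraphrased from the tutorial page, using its variable names):

## Train with all-fake batch (approximate excerpt from the tutorial)
noise = torch.randn(b_size, nz, 1, 1, device=device)
fake = netG(noise)
label.fill_(fake_label)
output = netD(fake.detach()).view(-1)  # the tutorial passes the fake batch through netD detached
errD_fake = criterion(output, label)
errD_fake.backward()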
The error is raised at
errG.backward()
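To make sure I understood the message, I reproduced the same RuntimeError with a tiny standalone snippet (just placeholder tensors, unrelated to my model):

import torch

x = torch.randn(3, requires_grad=True)
y = (x * 2).sum()
y.backward()  # the first backward frees the saved intermediate results
y.backward()  # raises the same RuntimeError

So I assume my loop is somehow backpropagating through the same graph twice, but I can't see where that happens.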