I am trying to train two networks (a generator and a discriminator) at the same time, calling backward() on two different losses, but the second call breaks with this error:
Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.
I do it this way:
# ---- discriminator update: one real sample and one generated sample ----
optimizer_d.zero_grad()
npArray = randomNpArray(1, 3, 4, 11)                  # noise input for the generator
randomTensor = numpyToTensor(npArray).cuda().float()
guess_real = discriminator(image)                     # image is a real sample
loss_d_real = criterion(guess_real, Tensor([1,0]).cuda() )
realLossesArray.append(loss_d_real.item())
generatedImage = generator(randomTensor)
guess_fake = discriminator(generatedImage)            # the graph here runs through both networks
loss_d_fake = criterion(guess_fake, Tensor([0,1]).cuda() )
fakeLossesArray.append(loss_d_fake.item())
loss_d = loss_d_real + loss_d_fake
loss_d.backward()
optimizer_d.step()
# ---- generator update ----
optimizer_d.zero_grad()
optimizer_g.zero_grad()
loss_g = criterion( guess_fake, Tensor([1,0]).cuda() )
generatorLossesArray.append(loss_g.item())
loss_g.backward()                                     # this is where the error is raised
optimizer_g.step()
I don’t understand how to tell PyTorch that the second gradient computation is completely independent of the first one.
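If I understand the error correctly, loss_g.backward() tries to go back through guess_fake again, and that part of the graph was already freed by loss_d.backward(). A pattern I have seen in GAN examples is to detach the generated image for the discriminator step and to run the discriminator a second time for the generator step. This is my rough adaptation of that idea (guess_fake_for_g is just a name I made up, and I have not verified this against my full setup):

# discriminator update: cut the graph at the generated image
optimizer_d.zero_grad()
guess_real = discriminator(image)
loss_d_real = criterion(guess_real, Tensor([1, 0]).cuda())

generatedImage = generator(randomTensor)
guess_fake = discriminator(generatedImage.detach())   # detach() stops gradients flowing into the generator
loss_d_fake = criterion(guess_fake, Tensor([0, 1]).cuda())
loss_d = loss_d_real + loss_d_fake
loss_d.backward()                                     # frees only the discriminator part of the graph
optimizer_d.step()

# generator update: fresh forward pass, so a fresh graph
optimizer_g.zero_grad()
guess_fake_for_g = discriminator(generatedImage)      # no detach here, gradients must reach the generator
loss_g = criterion(guess_fake_for_g, Tensor([1, 0]).cuda())
loss_g.backward()
optimizer_g.step()

Is detaching and re-running the discriminator the right way to handle this, or is retain_graph=True from the error message the intended fix here?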
I’d appreciate any help.