Hi. I have an adversarial training setup with several models and two optimizers. The two optimizers update disjoint sets of parameters, and yet I get this error:
RuntimeError: Trying to backward through the graph a second time, but the saved intermediate results have already been freed. Specify retain_graph=True when calling .backward() or autograd.grad() the first time.
The corresponding code is below:
from itertools import chain

optimizer_G = torch.optim.Adam(
    chain(img_encoder.parameters(), pts_encoder.parameters(),
          img_decoder.parameters(), pts_decoder.parameters()),
    lr=lr, betas=(b1, b2), weight_decay=decay)
optimizer_D = torch.optim.Adam(
    discriminator.parameters(), lr=lr, betas=(b1, b2), weight_decay=decay)
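For what it's worth, a quick sanity check (a sketch I added for this post, not part of my training code) confirms that the two optimizers really do hold disjoint parameter sets:

# Sanity check (illustrative only): the two optimizers share no parameters
g_params = {id(p) for group in optimizer_G.param_groups for p in group["params"]}
d_params = {id(p) for group in optimizer_D.param_groups for p in group["params"]}
assert g_params.isdisjoint(d_params)  # passes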
...
g_loss = (alpha * adversarial_loss(discriminator(pts_feats), valid)
          + (1 - alpha) * reconstruction_loss)
optimizer_G.zero_grad()
g_loss.backward()   # first backward pass; frees the graph that produced pts_feats
optimizer_G.step()
real_loss = adversarial_loss(discriminator(pts_feats), valid)
fake_loss = adversarial_loss(discriminator(img_feats), fake)
d_loss = (real_loss + fake_loss) / 2
optimizer_D.zero_grad()
d_loss.backward()   # the RuntimeError is raised here
optimizer_D.step()
Why do I need to retain the graph? Since each optimizer updates different weights, I would expect the two backward passes to be disjoint.
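If it helps, here is a minimal sketch that reproduces the same error; the two Linear modules and all names are toy stand-ins, not my real models:

import torch

# Two modules with disjoint parameters, each with its own optimizer.
encoder = torch.nn.Linear(4, 4)
disc = torch.nn.Linear(4, 1)
opt_g = torch.optim.Adam(encoder.parameters())
opt_d = torch.optim.Adam(disc.parameters())

x = torch.randn(2, 4)
feats = encoder(x)           # intermediate tensor used by both losses

g_loss = disc(feats).mean()
opt_g.zero_grad()
g_loss.backward()            # frees the graph x -> feats
opt_g.step()

d_loss = disc(feats).mean()  # fresh pass through disc, but feats still
opt_d.zero_grad()            # carries a grad_fn into the freed graph
d_loss.backward()            # RuntimeError: Trying to backward through the graph a second time

In this toy version the error goes away if I compute the second loss on disc(feats.detach()), which suggests the two backward passes overlap through feats, but I would like to understand exactly what is shared.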