I think you might be running into an issue similar to the one described here.
After skimming through the code, I'm unsure why `retain_graph=True` is used in `d_loss.backward(retain_graph=True)`, and why `fake_img` as well as `fake_out` are recomputed at the end before the `optimizerG.step()` call. Could you explain this workflow a bit and check whether the issue from the link might also apply to your use case?
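
For reference, here is a minimal sketch of the pattern I had in mind: detaching the fake images for the discriminator update means `d_loss.backward()` never touches the generator's graph (so `retain_graph=True` isn't needed), and the generator update only re-runs the discriminator instead of recomputing `fake_img`. The model definitions, names, and shapes below are placeholders, not your actual code:

```python
import torch
import torch.nn as nn

# Placeholder models standing in for the ones in your script.
netG = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))
netD = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
optimizerG = torch.optim.Adam(netG.parameters(), lr=1e-4)
optimizerD = torch.optim.Adam(netD.parameters(), lr=1e-4)
criterion = nn.BCEWithLogitsLoss()

real_img = torch.randn(4, 8)   # dummy "real" batch for the sketch
noise = torch.randn(4, 16)

# --- Discriminator update ---
optimizerD.zero_grad()
fake_img = netG(noise)
real_out = netD(real_img)
# .detach() cuts the graph to netG, so d_loss.backward() only
# backpropagates through netD and retain_graph=True is unnecessary.
fake_out = netD(fake_img.detach())
d_loss = criterion(real_out, torch.ones_like(real_out)) + \
         criterion(fake_out, torch.zeros_like(fake_out))
d_loss.backward()
optimizerD.step()

# --- Generator update ---
optimizerG.zero_grad()
# Reuse the already computed fake_img; its graph to netG is still
# intact, so only the discriminator forward pass is repeated here.
fake_out = netD(fake_img)
g_loss = criterion(fake_out, torch.ones_like(fake_out))
g_loss.backward()
optimizerG.step()
```

If your code instead recomputes `fake_img = netG(noise)` right before `optimizerG.step()`, that extra generator forward pass would be redundant in this setup, which is why I was asking about the workflow.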