Multiple Loss Functions in a Model

I see. Even in this case, the final lossD.backward() runs into the same variable-modified-in-place error.

From @albanD's answer here:

You can use del lossD instead of the final lossD.backward() (to release the computational graph). Can you try that?
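For reference, here is a minimal standalone illustration of that suggestion (the tensors and loss names are made up, not from the original model): deleting a loss you never back-propagate drops the last reference to its graph, so the buffers saved for it can be freed.

import torch

x = torch.randn(4, requires_grad=True)
loss_a = (x ** 2).sum()   # loss we actually back-propagate
loss_b = x.sin().sum()    # loss we decide not to use

loss_a.backward()
del loss_b                # no backward() call; dropping the reference lets autograd free loss_b's graph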

Edit: Can you pack the encoder and decoder into one optimizer, or backward them together, if possible? The encoder's gradient calculation depends on the decoder's parameters as well, so you can't call optimizer_decoder.step() before loss_encoder.backward(). One solution is as follows (a fuller self-contained sketch follows the snippet):

# calculate encoder loss, decoder loss, and discriminator loss here
# (loss_encoder, loss_decoder, lossD)

# discriminator update
optimizer_disc.zero_grad()
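# retain_graph=True so the graph shared with loss_encoder/loss_decoder
# is kept alive for the second backward below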
lossD.backward(retain_graph=True)
optimizer_disc.step()

# encoder and decoder update
optimizer_encoder.zero_grad()
optimizer_decoder.zero_grad()

loss_generator = loss_encoder + loss_decoder
loss_generator.backward()

optimizer_decoder.step()
optimizer_encoder.step()

# to release the computation graph of the discriminator
del lossD
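To make the ordering concrete, here is a self-contained sketch of the same update scheme with toy linear modules; all module names and loss definitions are placeholders, not the original model.

import torch
import torch.nn as nn

encoder = nn.Linear(8, 4)
decoder = nn.Linear(4, 8)
discriminator = nn.Linear(8, 1)

optimizer_encoder = torch.optim.Adam(encoder.parameters(), lr=1e-3)
optimizer_decoder = torch.optim.Adam(decoder.parameters(), lr=1e-3)
optimizer_disc = torch.optim.Adam(discriminator.parameters(), lr=1e-3)

x_real = torch.randn(16, 8)

# one forward pass whose graph is shared by all three losses
z = encoder(x_real)
x_rec = decoder(z)

loss_encoder = z.pow(2).mean()                  # placeholder latent regularizer
loss_decoder = (x_rec - x_real).pow(2).mean()   # placeholder reconstruction loss
lossD = discriminator(x_rec).mean() - discriminator(x_real).mean()  # placeholder critic loss

# discriminator update; retain_graph=True because lossD shares the
# encoder/decoder part of the graph with loss_encoder and loss_decoder
optimizer_disc.zero_grad()
lossD.backward(retain_graph=True)
optimizer_disc.step()

# encoder and decoder update: zero out the gradients lossD.backward()
# left on them, then backward the combined loss once and step both
optimizer_encoder.zero_grad()
optimizer_decoder.zero_grad()
loss_generator = loss_encoder + loss_decoder
loss_generator.backward()
optimizer_decoder.step()
optimizer_encoder.step()

# release the discriminator's part of the computation graph
del lossD

The key point is that loss_generator here does not pass through the discriminator, so stepping optimizer_disc first does not invalidate any tensor saved for loss_generator's backward.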