Different backwards through different branches

I don’t know how the different optimizers etc. are initialized, but it seems you are trying to compute multiple losses, backpropagate each of them, and then call step() on different optimizers.
For this to work, make sure the computation graph is kept alive (e.g. by passing retain_graph=True to the first backward() call) if you need to backpropagate through it again, and also make sure the forward activations aren’t “stale”, i.e. that no optimizer.step() has already updated the parameters those activations were computed with before the next backward() call.
This post explains these errors in a GAN setup.
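
Here is a minimal sketch of the pattern I mean. The model (trunk/head_a/head_b), the losses, and the optimizer setup are made-up placeholders, since I don’t know your actual code:

```python
import torch
import torch.nn as nn

# Hypothetical setup: a shared trunk with two heads,
# each head trained by its own optimizer.
trunk = nn.Linear(10, 10)
head_a = nn.Linear(10, 1)
head_b = nn.Linear(10, 1)

opt_a = torch.optim.SGD(list(trunk.parameters()) + list(head_a.parameters()), lr=1e-2)
opt_b = torch.optim.SGD(head_b.parameters(), lr=1e-2)

x = torch.randn(4, 10)
target = torch.randn(4, 1)

features = trunk(x)  # shared forward pass
loss_a = nn.functional.mse_loss(head_a(features), target)
loss_b = nn.functional.mse_loss(head_b(features), target)

# First backward: keep the graph alive, since loss_b still needs it.
loss_a.backward(retain_graph=True)

# Second backward through the same (shared) graph; it can be freed now.
loss_b.backward()

# step() only after all backward() calls that depend on the current
# forward activations are done. Calling opt_a.step() between the two
# backward() calls would update the trunk parameters in-place and make
# the saved activations stale, raising a RuntimeError.
opt_a.step()
opt_b.step()
opt_a.zero_grad()
opt_b.zero_grad()
```

If your use case instead requires stepping the first optimizer before computing the second loss, rerun the forward pass afterwards so the activations match the updated parameters.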
