Use Multiple Optimizers in one Model

I think you might be running into this error because optimizer3.step() updates the parameters that could have been used to calculate loss2; loss12.backward() would then try to compute gradients using stale intermediate forward activations, since the corresponding parameters were already modified in-place (as described in the linked post).