Struggling with a Runtime Error related to in-place operations

Thanks for the code.
It’s quite long, so I haven’t looked at it deeply enough to understand all the nuances of the training.
However, I guess the in-place error might come from these lines of code:

                actor_loss.backward(retain_graph=True)
                critic_loss.backward(retain_graph=True)
                
                optimizer_actor.step()
                optimizer_critic.step()

Here it seems you are retaining the graph and trying to update the “old” parameters multiple times.
I.e. the vanilla workflow would be (see the sketch after the list):

  • model parameters are in state P0
  • execute forward pass
  • execute backward pass and calculate gradients
  • update parameters using their gradients via optimizer.step()
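
In code, this vanilla workflow would look roughly like this (a minimal sketch; the model, criterion, and data are placeholders, not your actual setup):

import torch
import torch.nn as nn

model = nn.Linear(1, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
x, target = torch.randn(1, 1), torch.randn(1, 1)

out = model(x)                 # forward pass with parameters in state P0
loss = criterion(out, target)  # calculate the loss
optimizer.zero_grad()          # clear gradients from previous iterations
loss.backward()                # backward pass: gradients w.r.t. P0
optimizer.step()               # update parameters: P0 -> P1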

In your code snippet you are using retain_graph=True, which will keep the intermediate activations so that the gradients (from the previous forward pass) can be calculated again.
However, since the optimizer.step() calls were already performed and have updated the parameters in-place, these stored activations no longer match the current parameters; autograd detects the in-place modification and raises this error.

Here is a small code snippet of what I mean:

import torch
import torch.nn as nn

model = nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
x = torch.randn(1, 1)

out = model(x)
out.backward(retain_graph=True)
optimizer.step()  # works, gradients match the parameters used in the forward pass

# gradients would now be wrong, since out was not calculated by the current
# (already updated) parameters; autograd detects the in-place update
out.backward(retain_graph=False)  # raises the RuntimeError
optimizer.step()
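
If that’s the case, a common fix is to finish all backward calls before any optimizer.step(). A minimal sketch, assuming the actor and critic share part of the graph (the shared module, the heads, and the dummy losses here are made up for illustration, not taken from your code):

import torch
import torch.nn as nn

# toy stand-ins: a shared feature extractor is why retain_graph=True
# is needed on the first backward call
shared = nn.Linear(4, 8)
actor_head = nn.Linear(8, 2)
critic_head = nn.Linear(8, 1)
optimizer_actor = torch.optim.SGD(
    list(shared.parameters()) + list(actor_head.parameters()), lr=1e-3)
optimizer_critic = torch.optim.SGD(critic_head.parameters(), lr=1e-3)

state = torch.randn(1, 4)
features = shared(state)
actor_loss = actor_head(features).pow(2).mean()
critic_loss = critic_head(features).pow(2).mean()

optimizer_actor.zero_grad()
optimizer_critic.zero_grad()

# both backward passes run before any parameter update, so the retained
# graph still matches the parameters it was created with
actor_loss.backward(retain_graph=True)  # keep the shared graph alive
critic_loss.backward()                  # last backward can free the graph

optimizer_actor.step()
optimizer_critic.step()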

Does this error description fit your training routine?