Reinforcement Learning: RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed)

Both losses, critic_loss and actor_loss, use the advantage tensor in their computation.
The first actor_loss.backward() call frees the intermediate activations stored during the forward pass, so the subsequent critic_loss.backward() fails: both backward passes depend on the computation graph (and the saved activations) attached to advantage.
To solve the issue, you could call actor_loss.backward(retain_graph=True) or, if it fits your use case, sum both losses and call .backward() once on the sum, as sketched below.
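
Here is a minimal sketch reproducing the failure and both fixes; the tensor names (value, log_prob, reward) are illustrative stand-ins, not taken from your code:

```python
import torch

# Illustrative stand-ins for critic and actor outputs.
value = torch.randn(4, requires_grad=True)
log_prob = torch.randn(4, requires_grad=True)
reward = torch.randn(4)

advantage = reward - value                   # shared node in the graph
actor_loss = -(log_prob * advantage).mean()
critic_loss = advantage.pow(2).mean()

# Fails: the first backward() frees the graph attached to advantage,
# so the second backward() has no saved activations left to use.
#   actor_loss.backward()
#   critic_loss.backward()  # RuntimeError

# Fix 1: keep the graph alive for the second backward pass.
actor_loss.backward(retain_graph=True)
critic_loss.backward()

# Fix 2 (alternative): sum the losses and call backward() once.
#   (actor_loss + critic_loss).backward()
```

Note that retain_graph=True keeps the saved activations in memory until the last backward call, so summing the losses is usually the cheaper option when it fits.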
