Both losses, `critic_loss` and `actor_loss`, use the `advantage` tensor in their computation. The first `actor_loss.backward()` call will free the intermediate activations stored during the previous forward pass, which will cause the subsequent `critic_loss.backward()` call to fail, since both backward passes depend on the computation graph (and the intermediate activations) attached to `advantage`.

To solve the issue you could use `actor_loss.backward(retain_graph=True)` or, if it fits your use case, sum both losses together and call `.backward()` once on the sum.
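A minimal sketch of the situation (the module shapes and loss definitions here are hypothetical, just enough to make both losses share the graph attached to `advantage`), showing both fixes:

```python
import torch
import torch.nn as nn

# Hypothetical actor/critic modules; any shapes work as long as
# both losses are computed from the same `advantage` tensor.
critic = nn.Linear(4, 1)
actor = nn.Linear(4, 2)

state = torch.randn(8, 4)
returns = torch.randn(8, 1)

value = critic(state)
advantage = returns - value  # graph attached to the critic's forward pass

log_probs = torch.log_softmax(actor(state), dim=-1)[:, :1]
actor_loss = -(log_probs * advantage).mean()
critic_loss = advantage.pow(2).mean()

# Fix 1: keep the shared graph alive for the second backward call.
# Without retain_graph=True, critic_loss.backward() would raise a
# RuntimeError because the buffers were already freed.
actor_loss.backward(retain_graph=True)
critic_loss.backward()

# Fix 2 (alternative, on a fresh forward pass): sum both losses
# and call backward once, so the graph is only traversed one time.
# (actor_loss + critic_loss).backward()
```

Note that in many actor-critic setups the advantage is deliberately detached (`advantage.detach()`) in the policy loss, in which case only the critic loss backpropagates through it and the problem disappears.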