@albanD Hi, does detach() erase the existing weights?
So if we do:
policy_loss = …
self.policy_optimizer.zero_grad()
policy_loss.backward()
self.policy_optimizer.step()
policy_loss.detach()
For some reason, when I ran this in a loop, the weights in policy.parameters() (the ones registered with the policy optimizer) stopped changing, and I don’t understand the mechanism behind why that happens.
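For reference, here is a minimal sketch of the kind of loop I mean. The network, loss, and data are just placeholders to make it runnable, not my real code:

```python
import torch
import torch.nn as nn

# Placeholder policy network and optimizer -- stand-ins for my real setup.
policy = nn.Linear(4, 2)
policy_optimizer = torch.optim.Adam(policy.parameters(), lr=1e-2)

for step in range(10):
    obs = torch.randn(8, 4)                               # dummy batch of observations
    logits = policy(obs)
    policy_loss = -logits.log_softmax(dim=-1).mean()       # dummy stand-in for the real loss

    policy_optimizer.zero_grad()
    policy_loss.backward()
    policy_optimizer.step()

    # The line I'm asking about: detach() returns a new tensor and the result
    # isn't assigned to anything, so I expected it to have no effect on the weights.
    policy_loss.detach()

    print(step, policy_loss.item())
```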