PyTorch Forums
What's the right way of implementing policy gradient?
reinforcement-learning
11118
(王玮)
August 9, 2017, 12:11pm
8
Hmm, here is the full formula.
The policy weights the gradient of the loss (loss.grad), not the loss itself.
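To make that distinction concrete, here is a rough sketch in plain Python. It is not the formula from the attached image. It assumes the simplest case I could think of: a single state (a bandit), a softmax policy over three actions, and a fixed reward per action. All function names, logits, and rewards are made up for illustration. It compares the true gradient of the expected return with (a) the policy-weighted sum of per-action gradients of log pi(a) * R(a), and (b) the gradient you get if you multiply the objective itself by the policy. For a loss instead of an objective, flip the sign.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def expected_return(logits, rewards):
    # J(theta) = sum_a pi(a) * R(a)
    return sum(p * r for p, r in zip(softmax(logits), rewards))

def grad_log_prob(logits, action):
    # d log pi(action) / d logit_k = 1[k == action] - pi(k)
    probs = softmax(logits)
    return [(1.0 if k == action else 0.0) - p for k, p in enumerate(probs)]

def policy_weighted_grad(logits, rewards):
    # sum_a pi(a) * grad(log pi(a) * R(a)): the policy weights the gradient.
    probs = softmax(logits)
    grad = [0.0] * len(logits)
    for a, (p, r) in enumerate(zip(probs, rewards)):
        g = grad_log_prob(logits, a)
        for k in range(len(logits)):
            grad[k] += p * r * g[k]
    return grad

def policy_weighted_objective(logits, rewards):
    # The variant being warned against: sum_a pi(a) * log pi(a) * R(a),
    # i.e. the policy weights the objective itself before differentiating.
    return sum(p * math.log(p) * r for p, r in zip(softmax(logits), rewards))

def numerical_grad(f, logits, eps=1e-6):
    # Central finite differences, used here only as a reference gradient.
    grad = []
    for k in range(len(logits)):
        up, down = list(logits), list(logits)
        up[k] += eps
        down[k] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad

logits = [0.2, -0.5, 1.0]   # hypothetical policy parameters
rewards = [1.0, 0.0, 2.0]   # hypothetical per-action rewards

true_grad = numerical_grad(lambda l: expected_return(l, rewards), logits)
pg_grad = policy_weighted_grad(logits, rewards)
wrong_grad = numerical_grad(lambda l: policy_weighted_objective(l, rewards), logits)

print("true gradient:           ", true_grad)
print("policy-weighted gradient:", pg_grad)
print("gradient of weighted obj:", wrong_grad)
```

In this toy setup the first two agree and the third does not. In practice you rarely write the weighting explicitly: sampling actions from the policy and averaging the gradient of log pi(a) * R over the samples applies that weighting for you.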