Documentation for reinforce()

Hi all,

I’m having trouble understanding how and when to use a.reinforce(), despite the examples. In particular, why do I need it if I want to implement REINFORCE?

Thanks for your help

You need to call .reinforce() on the outputs of stochastic functions (.bernoulli(), .normal() and .uniform() at the moment) if you want autograd to estimate the gradient of the expectation of the reward. There’s no need to use it if you don’t do any sampling.

You could also implement REINFORCE manually; .reinforce() is just a convenient way of doing that.
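
To make "manually" a bit more concrete, here is a minimal sketch of the score-function (REINFORCE) estimator in plain Python, with no autograd involved. It is only an illustration under my own assumptions: a single Bernoulli action whose probability is sigmoid(theta), and a made-up reward. The names reinforce_grad_estimate, exact_grad and example_reward are hypothetical helpers, not part of any library.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def reinforce_grad_estimate(theta, reward_fn, n_samples, rng):
    """Monte Carlo estimate of d/dtheta E[reward(a)], with a ~ Bernoulli(sigmoid(theta))."""
    p = sigmoid(theta)
    total = 0.0
    for _ in range(n_samples):
        # Sample the stochastic action.
        a = 1 if rng.random() < p else 0
        # For a Bernoulli with logit theta, d/dtheta log P(a) = a - p.
        # REINFORCE weights that score by the reward of the sampled action.
        total += reward_fn(a) * (a - p)
    return total / n_samples


def exact_grad(theta, reward_fn):
    """Closed-form gradient of E[reward] = p * r(1) + (1 - p) * r(0)."""
    p = sigmoid(theta)
    return p * (1.0 - p) * (reward_fn(1) - reward_fn(0))


def example_reward(a):
    # Hypothetical reward: 1 for action 1, 0 for action 0.
    return 1.0 if a == 1 else 0.0


rng = random.Random(0)
estimate = reinforce_grad_estimate(0.3, example_reward, 20000, rng)
exact = exact_grad(0.3, example_reward)
print("REINFORCE estimate:", estimate)
print("exact gradient:    ", exact)
```

The sampled estimate should land close to the closed-form value. With a real policy you would do the same thing: multiply the reward by the gradient of the log-probability of the sampled action, which is the bookkeeping that .reinforce() takes care of for you.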

Thanks, that’s just what I needed!