Documentation for reinforce()

cartage · February 22, 2017, 9:44pm

Hi all,

I’m having trouble understanding how and when to use a.reinforce(), despite the examples. In particular, why do I need it if I want to implement REINFORCE?

Thanks for your help

apaszke · February 22, 2017, 11:19pm

You need to call .reinforce() on the outputs of stochastic functions (.bernoulli(), .normal() and .uniform() at the moment) if you want autograd to estimate the gradient of the expected reward. There is no need to use it if you don’t do any sampling.
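
For reference, the "gradient of the expectation of the reward" is usually written with the score-function identity below. The symbols are assumptions introduced for illustration and are not from the post: $x$ is the sampled output, $p_\theta$ the sampling distribution with parameters $\theta$, and $R(x)$ the reward.

$$
\nabla_\theta \, \mathbb{E}_{x \sim p_\theta}\left[ R(x) \right] = \mathbb{E}_{x \sim p_\theta}\left[ R(x) \, \nabla_\theta \log p_\theta(x) \right]
$$

The right-hand side can be estimated by averaging over samples, which is why the reward has to be supplied for each sampled output.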

apaszke · February 22, 2017, 11:19pm

You could implement REINFORCE manually; .reinforce() is just a convenient way of doing that.
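
To make "manually" concrete, here is a minimal sketch of the estimator in plain Python, without autograd. Everything in it is an assumption chosen for illustration: a single Bernoulli action whose probability is sigmoid(theta), a made-up reward of 1 for action 1 and 0 otherwise, and the helper names `sigmoid`, `reward` and `reinforce_gradient`. It is not how .reinforce() is implemented internally.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def reward(action):
    # Hypothetical reward, chosen only for this sketch:
    # 1.0 for action 1, 0.0 for action 0.
    return float(action)


def reinforce_gradient(theta, num_samples, rng):
    """Monte Carlo estimate of d/dtheta E[reward] for a Bernoulli policy."""
    p = sigmoid(theta)  # probability of sampling action 1
    total = 0.0
    for _ in range(num_samples):
        action = 1 if rng.random() < p else 0
        # Score function: d/dtheta log pi(action) = action - p
        # when p = sigmoid(theta).
        score = action - p
        total += reward(action) * score
    return total / num_samples


rng = random.Random(0)  # fixed seed so the run is repeatable
estimate = reinforce_gradient(theta=0.0, num_samples=20000, rng=rng)

# For this toy reward, E[reward] = p, so the exact gradient is p * (1 - p).
exact = sigmoid(0.0) * (1.0 - sigmoid(0.0))
print(estimate, exact)
```

With enough samples the estimate should land close to the exact value of 0.25 for theta = 0. In a real model you would use the estimated gradient (typically with a baseline subtracted from the reward to reduce variance) to update the parameters that produced the sampling probabilities.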

cartage · February 22, 2017, 11:30pm

Thanks, that’s just what I needed!