Reinforce deprecated?

notarobot · October 31, 2017, 2:56pm

I’ve been using action.reinforce(reward) for policy-gradient-based training, but it seems there has been a recent change, and I now get this error:

File "/opt/conda/envs/pytorch-py35/lib/python3.5/site-packages/torch/autograd/variable.py", line 209, in reinforce
if not isinstance(self.grad_fn, StochasticFunction):
NameError: name 'StochasticFunction' is not defined

I read on GitHub that .reinforce is being deprecated and that torch.distributions is the suggested replacement.

Is there a reason for this change? Reinforce seemed relatively simple and intuitive. It would be great if the reinforce example from pytorch were updated to reflect this change.

richard · October 31, 2017, 5:51pm

Here’s a good thread on the reason for the change. I think it can be summarized in two points: support for multiple stochastic outputs is difficult, and improving performance with Variables.

Kaixhin · November 4, 2017, 3:55pm

If you are on the 0.2 release, reinforce is still available. If you’re on master and have torch.distributions instead, the RL examples should now be as follows: https://github.com/pytorch/examples/pull/249

torch.distributions is much more general and suitable for a larger range of tasks - building the equivalent of reinforce using this is relatively simple (and arguably cleaner as it can be used to create a normal loss function to backpropagate).
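
To make the "normal loss function" point concrete, here is a minimal sketch of the pattern, not the code from the linked PR. It assumes a recent PyTorch where Categorical accepts a logits argument; the bare logits tensor standing in for a policy network and the toy rewards table are made up for illustration.

```python
import torch
from torch.distributions import Categorical

torch.manual_seed(0)

# Hypothetical stand-in for a policy network's output: logits over 3 actions.
logits = torch.zeros(3, requires_grad=True)
optimizer = torch.optim.SGD([logits], lr=0.1)

# Toy reward table (an assumption for illustration only).
rewards = torch.tensor([0.0, 0.5, 1.0])

for _ in range(200):
    dist = Categorical(logits=logits)
    action = dist.sample()  # sample an action from the current policy
    reward = rewards[action]

    # Old style was roughly: action.reinforce(reward), then a backward pass.
    # New style: build an ordinary scalar loss from the log-probability.
    loss = -dist.log_prob(action) * reward

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Action probabilities under the updated policy.
final_probs = torch.softmax(logits, dim=0).detach()
```

Because the loss is just a scalar tensor, it can be summed over an episode or combined with other terms (a value loss, say) before a single backward() call.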

notarobot · November 13, 2017, 8:23pm

This helps, thanks a lot!