You can only reinforce a stochastic Function once

alexis-jacq · April 10, 2017, 6:46pm

Is this problem avoidable?

When non-null rewards are very rare, it makes sense to “bootstrap”: reinforce the same past action with its reward more than once whenever that reward wasn’t null.

Is it possible to do it? Or should we just avoid Variable.reinforce in that case?

jekbradbury · April 10, 2017, 6:49pm

You could use something like this if you want to be able to reinforce vars cumulatively:

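import torch

def reinforce(var, reward):
    # _NOT_PROVIDED is the sentinel a stochastic function holds until it gets a reward
    if var.creator.reward is torch.autograd.stochastic_function._NOT_PROVIDED:
        var.creator.reward = reward
    else:
        # add out of place so the tensor passed as the first reward is not modified
        var.creator.reward = var.creator.reward + reward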

alexis-jacq · April 10, 2017, 6:57pm

Thanks, it works!

Out of curiosity, why can a stochastic function normally be reinforced only once?