You can only reinforce a stochastic Function once

Is this problem avoidable?

If non-null rewards are very rare, it makes sense to “bootstrap” the same past action-reward pair more than once whenever the received reward was non-null.

Is that possible? Or should we just avoid Variable.reinforce in that case?

You could use something like this if you want to be able to reinforce vars cumulatively:

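import torch

def reinforce(var, reward):
    # First reward for this stochastic node: just store it.
    if var.creator.reward is torch.autograd.stochastic_function._NOT_PROVIDED:
        var.creator.reward = reward
    else:
        # Accumulate out of place so the caller's reward tensor isn't modified.
        var.creator.reward = var.creator.reward + reward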

Thanks, it works!

Out of curiosity, why can a stochastic function normally be reinforced only once?