REINFORCE: Why centralize rewards?

freiburgzuo · April 12, 2017, 1:43pm

This line centralizes the rewards. Is there a specific reason for this? The original algorithm does not mention centralization.

ebetica · April 12, 2017, 3:12pm

This is a classic trick used in a lot of different papers. Normalizing the rewards speeds up learning a lot.
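
For illustration, here is a minimal sketch of what this kind of normalization usually looks like. It is written in plain Python and is not the code the question points to. The helper names `discounted_returns` and `normalize`, the `gamma` value, and the `eps` constant are all assumptions made for this sketch. Roughly, subtracting the mean acts like a baseline, which tends to lower the variance of the gradient estimate, and dividing by the standard deviation keeps the update scale similar from episode to episode.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Compute the discounted return G_t for every step of one episode."""
    returns = []
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def normalize(returns, eps=1e-8):
    """Subtract the mean and divide by the (population) standard deviation."""
    n = len(returns)
    mean = sum(returns) / n
    var = sum((g - mean) ** 2 for g in returns) / n
    std = math.sqrt(var)
    # eps (an assumed small constant) avoids division by zero
    # when every return in the episode is identical.
    return [(g - mean) / (std + eps) for g in returns]


# Hypothetical episode: a reward of 1.0 at each of four steps.
episode_rewards = [1.0, 1.0, 1.0, 1.0]
returns = discounted_returns(episode_rewards, gamma=0.5)
normalized = normalize(returns)
```

In a policy-gradient loss, each step's log-probability would then be weighted by the normalized value instead of the raw return.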