REINFORCE: Why centralize rewards?

This line centralizes the rewards. Is there a specific reason for doing so? The original algorithm does not mention centralization.

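This is a classic trick used in many papers: normalizing the rewards often speeds up learning considerably. Subtracting the mean acts like a baseline, which reduces the variance of the policy-gradient estimate, and dividing by the standard deviation keeps the scale of the gradient roughly the same from one episode to the next.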
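
As a rough illustration of the trick, here is a minimal sketch in Python with NumPy. It is not the code the question refers to: the function names `discounted_returns` and `normalize`, the `gamma` value, and the `eps` constant are all assumptions chosen for the example.

```python
import numpy as np


def discounted_returns(rewards, gamma=0.99):
    """Compute the discounted return G_t for every step of one episode."""
    returns = np.zeros(len(rewards), dtype=np.float64)
    running = 0.0
    # Walk backwards so each G_t reuses G_{t+1}.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def normalize(returns, eps=1e-8):
    """Center to zero mean and scale to (roughly) unit standard deviation.

    eps is an assumed small constant that avoids division by zero
    when all returns in the episode are identical.
    """
    returns = np.asarray(returns, dtype=np.float64)
    return (returns - returns.mean()) / (returns.std() + eps)


# Hypothetical episode: a reward of 1.0 at each of four steps.
rewards = [1.0, 1.0, 1.0, 1.0]
returns = discounted_returns(rewards, gamma=0.5)
normalized = normalize(returns)
```

In a REINFORCE update, `normalized[t]` would then be used in place of the raw return when weighting the log-probability of the action taken at step `t`. After centering, actions that did better than the episode average are pushed up and those that did worse are pushed down, instead of every action being pushed up by a different amount.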