How to understand `backward` of stochastic functions?

How should I understand `backward()` in stochastic functions?

For example, for the Normal distribution, `grad_mean = -(output - mean)/std**2`. Why does it follow this formula? Is it the derivative of the Gaussian PDF? The forward pass only uses `output = mean + std*eps` where `eps ~ N(0, 1)`, so shouldn't the gradient w.r.t. `mean` be the identity?
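To make the two candidate gradients concrete, here is a minimal sketch in plain Python (no PyTorch) that compares them numerically. The function names are made up for illustration and are not part of any library. It checks that `(x - mean)/std**2` matches the derivative of the Gaussian log-density (not the density itself) with respect to `mean` when the sample is held fixed, and that differentiating `mean + std*eps` with `eps` held fixed gives 1. The leading minus sign in the formula above is not reproduced here; it would be consistent with minimizing a negated log-probability, but that is an assumption on my part.

```python
import math


def normal_log_pdf(x, mean, std):
    """Log-density of N(mean, std^2) evaluated at x."""
    return (-0.5 * math.log(2.0 * math.pi)
            - math.log(std)
            - 0.5 * ((x - mean) / std) ** 2)


def score_wrt_mean(x, mean, std):
    """Analytic d/d(mean) of log N(x; mean, std^2), with the sample x held fixed."""
    return (x - mean) / std ** 2


def finite_diff_score(x, mean, std, h=1e-6):
    """Central finite difference of the log-density w.r.t. mean (x held fixed)."""
    upper = normal_log_pdf(x, mean + h, std)
    lower = normal_log_pdf(x, mean - h, std)
    return (upper - lower) / (2.0 * h)


def pathwise_wrt_mean(mean, std, eps, h=1e-6):
    """Central finite difference of output = mean + std*eps w.r.t. mean (eps held fixed)."""
    upper = (mean + h) + std * eps
    lower = (mean - h) + std * eps
    return (upper - lower) / (2.0 * h)


if __name__ == "__main__":
    x, mean, std, eps = 1.3, 0.5, 2.0, 0.4
    print("analytic score:       ", score_wrt_mean(x, mean, std))
    print("finite-diff score:    ", finite_diff_score(x, mean, std))
    print("pathwise d(out)/d(mu):", pathwise_wrt_mean(mean, std, eps))
```

The first two numbers should agree, and the third should be close to 1. So both intuitions are right, but they differentiate different things: one treats the sample as a fixed observation and differentiates its log-probability, the other differentiates through the sampling equation itself.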

Gradient formulas are based on "Simple Statistical Gradient-Following
Algorithms for Connectionist Reinforcement Learning", available at