# How to understand `backward` of stochastic functions?

How should I understand `backward()` in stochastic functions?

For example, for the Normal distribution, `grad_mean = -(output - mean)/std**2`. Why does it follow this formula? Is it the derivative of the Gaussian PDF? The forward pass only uses `output = mean + std*eps`, where `eps ~ N(0, 1)`, so shouldn't the gradient w.r.t. `mean` be the identity?
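To make the two candidate gradients concrete, here is a minimal sketch in plain Python (no PyTorch). It assumes the formula comes from differentiating the Gaussian log-density at a fixed sample, which is only one possible reading of that `backward`. The helper names `normal_log_pdf`, `score_wrt_mean` and `finite_diff` are made up for illustration, and the values of `mean`, `std` and `eps` are arbitrary. The leading minus sign and any reward scaling are not modeled; the sketch only compares the `(output - mean)/std**2` term with the derivative of the forward pass.

```python
import math


def normal_log_pdf(x, mean, std):
    """Log-density of N(mean, std**2) evaluated at x."""
    return (-0.5 * math.log(2.0 * math.pi)
            - math.log(std)
            - 0.5 * ((x - mean) / std) ** 2)


def score_wrt_mean(x, mean, std):
    """Analytic d/d(mean) of the log-density, holding the sample x fixed."""
    return (x - mean) / std ** 2


def finite_diff(f, t, h=1e-6):
    """Central finite-difference approximation of f'(t)."""
    return (f(t + h) - f(t - h)) / (2.0 * h)


# Arbitrary illustrative values (assumptions, not taken from any library).
mean, std, eps = 0.5, 2.0, 0.3
output = mean + std * eps  # the forward pass quoted in the question

# Reading 1: treat `output` as a fixed sample and differentiate log p(output).
numeric_score = finite_diff(lambda m: normal_log_pdf(output, m, std), mean)
analytic_score = score_wrt_mean(output, mean, std)

# Reading 2: hold eps fixed and differentiate the sample itself w.r.t. mean.
numeric_pathwise = finite_diff(lambda m: m + std * eps, mean)

# The expression from the question, for comparison with reading 1.
quoted_grad_mean = -(output - mean) / std ** 2

print("d log p / d mean (numeric): ", numeric_score)
print("d log p / d mean (analytic):", analytic_score)
print("d output / d mean (numeric):", numeric_pathwise)
print("formula from the question:  ", quoted_grad_mean)
```

Under these assumptions, the quoted formula matches the negated derivative of the log-PDF (not of the PDF itself), while differentiating `output = mean + std*eps` with `eps` held fixed gives the identity, as the question expects. Whether the actual `backward` uses the first reading is exactly what is being asked here.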