Why does the normal distribution have `loc` and `scale` as attributes?

Superficially, like many things in PyTorch, this follows NumPy's conventions, or in this case those of scipy.stats.

Going more in depth, the point is to generalize shifting (loc) and coordinate scaling (scale) of a distribution away from a “standard” location and scale. For example, the lognormal distribution’s scale parameter also performs such a scaling, rather than just changing σ.
Wikipedia has an entry on the scale parameter that elaborates on the concept.
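
To make the shifting and scaling idea concrete, here is a minimal sketch in plain Python (standard library only, so not PyTorch's actual implementation). It assumes the usual location-scale relation X = loc + scale * Z with Z standard normal, and the helper names `standard_normal_pdf` and `loc_scale_pdf` are made up for illustration.

```python
import math
from statistics import NormalDist


def standard_normal_pdf(z: float) -> float:
    """Density of the "standard" normal (loc=0, scale=1)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def loc_scale_pdf(x: float, loc: float, scale: float) -> float:
    """Density of X = loc + scale * Z, obtained from the standard density.

    Shift x back by loc, rescale the coordinate by scale, and divide by
    scale so that the density still integrates to 1.
    """
    return standard_normal_pdf((x - loc) / scale) / scale


loc, scale = 1.5, 2.0  # arbitrary illustrative values
x = 0.7

# Reference value from the standard library's normal distribution.
reference = NormalDist(mu=loc, sigma=scale).pdf(x)
# Same value obtained purely by shifting and scaling the standard density.
via_transform = loc_scale_pdf(x, loc, scale)

print(math.isclose(reference, via_transform, rel_tol=1e-9))
```

For the normal distribution the scale happens to coincide with σ, which is why the two are easy to conflate; the same shift-and-rescale recipe applies to any location-scale family.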

Best regards

Thomas
