maxwass (Max Wasserman) · 1
I’m attempting to find the methods used to sample from, and compute the reparameterized gradients of, the Dirichlet and Gamma distributions.
The Dirichlet and Gamma distributions both call seemingly undocumented internal functions.
For Dirichlet: torch._sample_dirichlet and torch._dirichlet_grad
For Gamma: torch._standard_gamma
Is there documentation for these functions and the sampling/gradient methods they implement?
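For concreteness, here is a minimal sketch (assuming the public torch.distributions API) of where these internals show up: Dirichlet.rsample and Gamma.rsample dispatch to them, and the reparameterized gradient flows back to the concentration parameters.

```python
import torch
from torch.distributions import Dirichlet, Gamma

# Concentration parameters we want reparameterized gradients for.
alpha = torch.tensor([1.5, 2.0, 0.5], requires_grad=True)

# Dirichlet.rsample draws via torch._sample_dirichlet; its backward pass
# uses torch._dirichlet_grad to get d(sample)/d(concentration).
x = Dirichlet(alpha).rsample()
(x ** 2).sum().backward()   # any scalar loss; gradient flows to alpha
print(alpha.grad)

# Gamma.rsample draws via torch._standard_gamma (then divides by rate).
k = torch.tensor([2.0], requires_grad=True)
y = Gamma(concentration=k, rate=torch.ones(1)).rsample()
y.sum().backward()
print(k.grad)
```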
maxwass (Max Wasserman) · 2
Bumping this. @ptrblck, @smth, do you folks know who the correct person to ask about this would be?
smth · 3
Hey Max.
Looks like these functions were never properly documented, but the CPU implementations for these are pretty readable.
This is where _sample_dirichlet goes to: pytorch/aten/src/ATen/native/Distributions.cpp at a6ac6447b55bcf910dee5f925c2c17673f162a36 · pytorch/pytorch · GitHub. I figured this out by looking for _sample_dirichlet in the native_functions.yaml file.
This is where _standard_gamma goes to: pytorch/aten/src/ATen/native/Distributions.cpp at a6ac6447b55bcf910dee5f925c2c17673f162a36 · pytorch/pytorch · GitHub
For both of these, the core gamma sampling function that they call is implemented here (lifted straight from the NumPy sources): pytorch/aten/src/ATen/native/Distributions.h at a6ac6447b55bcf910dee5f925c2c17673f162a36 · pytorch/pytorch · GitHub
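For intuition, here is a rough Python transcription of the Marsaglia–Tsang (2000) squeeze method that this NumPy-derived gamma sampler is based on. This is a readability sketch, not the actual ATen code.

```python
import math
import random

def sample_standard_gamma(alpha: float) -> float:
    """Illustrative sketch of the Marsaglia-Tsang squeeze method for Gamma(alpha, 1)."""
    # Boost trick for alpha < 1: sample Gamma(alpha + 1) and rescale by U^(1/alpha).
    if alpha < 1.0:
        u = random.random()
        return sample_standard_gamma(alpha + 1.0) * u ** (1.0 / alpha)

    d = alpha - 1.0 / 3.0
    c = 1.0 / math.sqrt(9.0 * d)
    while True:
        x = random.gauss(0.0, 1.0)
        v = (1.0 + c * x) ** 3
        if v <= 0.0:
            continue
        u = random.random()
        # Fast acceptance (squeeze) test, then the exact log-acceptance test.
        if u < 1.0 - 0.0331 * x ** 4:
            return d * v
        if math.log(u) < 0.5 * x ** 2 + d * (1.0 - v + math.log(v)):
            return d * v
```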
Hope this helps.
maxwass (Max Wasserman) · 4
Yes, this is very helpful! Thanks for diving into the belly of the beast for me… I would never have found these!
Can I assume the GPU implementations use effectively the same algorithms?
smth · 5
Yes, the GPU implementations use the same algorithms.