Sampling from an arbitrary pdf in PyTorch

Subho · September 9, 2019, 7:17pm

Is there a fast and efficient way in PyTorch to sample a vector (of potentially large dimension), given only its pdf in closed form? In my case, it is intractable to analytically compute the inverse of the cdf.

KFrank · September 9, 2019, 9:40pm

Hi Subho!

Well, it depends, of course, on your probability density function …

You say nothing about it other than that you have a closed-form
expression for it.

Given that, pre-compute the following:

Analytically or numerically compute its integral to get the
cumulative distribution function. You say that you cannot invert
this analytically, so invert it numerically, storing the (pre-computed)
inverse as a look-up / interpolation table. (The granularity and
interpolation scheme for the interpolation table will depend on
the accuracy required of your inverted cumulative distribution
function.)

Then on a tensor (vector) basis, generate a bunch of uniform
deviates and pump them through (on a tensor basis) your inverse
cumulative distribution function. You now have a tensor (vector)
of samples from your original probability density function.
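As a rough sketch of the steps above, assuming a one-dimensional
density on a bounded interval: the helper names
(build_inverse_cdf_table, sample_from_table), the grid size, the
trapezoid rule, the linear interpolation, and the example density
are all illustrative choices, not something given in this thread.

```python
import torch

def build_inverse_cdf_table(pdf, lo, hi, n_grid=10_001):
    """Tabulate the cdf of `pdf` on [lo, hi] (hypothetical helper).

    `pdf` need not be normalized; the table is rescaled to end at 1.
    """
    x = torch.linspace(lo, hi, n_grid, dtype=torch.float64)
    p = pdf(x)
    # Trapezoid rule on each grid cell, then a running sum gives the cdf.
    cell_area = 0.5 * (p[1:] + p[:-1]) * (x[1:] - x[:-1])
    cdf = torch.cat([torch.zeros(1, dtype=torch.float64),
                     torch.cumsum(cell_area, dim=0)])
    return x, cdf / cdf[-1]

def sample_from_table(x, cdf, shape, generator=None):
    """Push uniform deviates through the tabulated inverse cdf."""
    u = torch.rand(shape, dtype=cdf.dtype, generator=generator)
    # First table index whose cdf value is strictly greater than u.
    idx = torch.searchsorted(cdf, u, right=True).clamp(1, cdf.numel() - 1)
    c_lo, c_hi = cdf[idx - 1], cdf[idx]
    # Linear interpolation inside the bracketing cell.
    t = (u - c_lo) / (c_hi - c_lo)
    return x[idx - 1] + t * (x[idx] - x[idx - 1])

# Example with an assumed (unnormalized) density exp(-x^4) on [-3, 3].
x_grid, cdf_grid = build_inverse_cdf_table(lambda x: torch.exp(-x**4), -3.0, 3.0)
samples = sample_from_table(x_grid, cdf_grid, (1000,))
```

The table is built once; after that each batch of samples costs one
torch.rand call, one torch.searchsorted call, and a few elementwise
operations, with no Python-level loop.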

Good luck!

K. Frank

Subho · September 11, 2019, 7:27pm

Turns out that I can use simple rejection sampling in my case. Thanks anyway.

KFrank · September 12, 2019, 12:08am

Hello Subho!

Rejection sampling doesn’t parallelize naturally, because each
element needs its own data-dependent number of proposals before
one is accepted. You would therefore typically run the
rejection-sampling loop individually for each element of the
vector you want to populate with random deviates, thereby
forgoing the benefits of using PyTorch’s tensor operations.
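To make the per-element loop concrete, here is a minimal sketch,
assuming a one-dimensional density on [lo, hi] with a known upper
bound pdf_max and a uniform proposal. The function name and all
of its parameters are hypothetical, chosen only for illustration.

```python
import torch

def rejection_sample_elementwise(pdf, lo, hi, pdf_max, n):
    """Fill a length-n vector one element at a time (hypothetical helper).

    Assumes 0 <= pdf(x) <= pdf_max for every x in [lo, hi].
    """
    out = torch.empty(n, dtype=torch.float64)
    for i in range(n):
        # The number of trips through this inner loop is random and
        # differs from element to element.
        while True:
            x = lo + (hi - lo) * torch.rand((), dtype=torch.float64)
            u = pdf_max * torch.rand((), dtype=torch.float64)
            if u < pdf(x):  # accept with probability pdf(x) / pdf_max
                out[i] = x
                break
    return out

# Example with an assumed density 2x on [0, 1], bounded above by 2.
samples = rejection_sample_elementwise(lambda x: 2.0 * x, 0.0, 1.0, 2.0, 100)
```

Every proposal here is a scalar operation issued from Python, so
the work is not expressed as operations on whole tensors.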

Best.

K. Frank