Adding a numerically stable inverse CDF of the standard Gaussian

Hi,

The current icdf in torch.distributions.Normal seems to be numerically unstable compared to scipy.stats.norm.

I implemented a torch version by translating the scipy source code to Python, and it behaves the same as the scipy version. I wonder if we could integrate the new code into the current torch.distributions.Normal. Thank you!

Could you explain the numerical instability a bit, i.e. where it is observed and how scipy deals with it?
I’m sure a PR would be welcome if the instability is reproducible or breaks certain use cases.

Numerical instability occurs when the input to the inverse CDF is very close to 0 or 1.
scipy handles these cases with dedicated tail approximations, described in the comments of its source code (the Cephes ndtri routine).
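
For illustration (my own snippet, not from the original post), scipy’s ppf stays finite even for probabilities far into the tails, where a plain erfinv-based formula would overflow:

import scipy.stats

# scipy's ppf switches to tail approximations for extreme probabilities,
# so the result stays finite instead of hitting +/-inf.
print(scipy.stats.norm.ppf(1e-300))     # roughly -37.0, not -inf
print(scipy.stats.norm.ppf(1 - 1e-12))  # roughly 7.03, not inf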

One example is when y = 1 - 1e-8 (reproduced in the snippet after this list):

  • torch.distributions.Normal(0, 1).icdf(torch.tensor(y)) returns tensor(inf) (because torch.erfinv returns tensor(inf));
  • scipy.stats.norm.ppf(y) returns 5.612001243305505.
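
A minimal reproduction of this comparison (my own snippet; torch.tensor defaults to float32, which is what triggers the problem):

import torch
import scipy.stats

y = 1 - 1e-8

# float32 by default: 1 - 1e-8 rounds to exactly 1.0, so icdf ends up
# evaluating erfinv at 1.0, which is inf
print(torch.distributions.Normal(0.0, 1.0).icdf(torch.tensor(y)))  # tensor(inf)

# scipy computes in float64 with tail approximations and stays finite
print(scipy.stats.norm.ppf(y))  # 5.612001243305505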

Thanks for the update!
Would you like to create a feature request on GitHub and post the explanation there, so that we can discuss it with the module owners and track it?
Btw, thanks for digging into this. 🙂


that’s kinda related to float32 precision, as 1e-8 < 2**-23 ≈ 1.19e-7 (the float32 machine epsilon), so 1 - 1e-8 rounds to exactly 1.0 in float32

and
torch.erfinv(torch.tensor([1-2**-23])) → 3.7439

the issue is this:

>>> import numpy as np
>>> np.float32(1) - np.float32(1e-8)
1.0
>>> np.frexp(np.float32(1) - np.float32(1e-8))
(0.5, 1)

i.e. the subtraction rounds back to exactly 1.0 in float32 (frexp confirms the mantissa is exactly 0.5)
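
For contrast, a quick check (my own snippet) that float64 has enough precision to represent 1 - 1e-8:

>>> np.float64(1) - np.float64(1e-8)
0.99999999
>>> np.frexp(np.float64(1) - np.float64(1e-8))
(0.499999995, 1)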


I see, this is great, thank you @googlebot!!
Hi @ptrblck, I think there’s no need for a feature request, since using a DoubleTensor fixes the issue.
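
For reference, a sketch of the workaround (my own snippet): building both the distribution parameters and the input as double-precision tensors (i.e. DoubleTensors) keeps icdf finite:

import torch

# construct the distribution with float64 parameters so the whole
# computation runs in double precision
normal = torch.distributions.Normal(
    torch.tensor(0.0, dtype=torch.float64),
    torch.tensor(1.0, dtype=torch.float64),
)
p = torch.tensor(1 - 1e-8, dtype=torch.float64)
print(normal.icdf(p))  # finite, ~5.6120, matching scipy; float32 gives inf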

Thank you both for your help and suggestions!