Cumulative distribution function of a Tensor-CDF

amin_sabet · December 20, 2019, 11:31am

How can I get the empirical cumulative distribution function (CDF) of a tensor X, evaluated at a value V?

Here is the equivalent code in Python, using statsmodels:

from statsmodels.distributions.empirical_distribution import ECDF

ecdf = ECDF(X)
Val = ecdf(V)

edmondj · September 28, 2023, 11:57am

…Too bad, it is very useful for normalizing data.
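
To illustrate the normalization use case, here is a minimal sketch, assuming a 1-D floating-point tensor. The helper name ecdf_normalize is made up for this example and is not a PyTorch API. It maps every sample to the fraction of samples less than or equal to it, so the output lies in (0, 1] regardless of the original scale:

```python
import torch


def ecdf_normalize(x: torch.Tensor) -> torch.Tensor:
    """Map each sample to its empirical CDF value (hypothetical helper)."""
    sorted_x = x.sort().values
    # side='right' counts how many sorted samples are <= each value.
    counts = torch.searchsorted(sorted_x, x, side='right')
    return counts.to(x.dtype) / x.numel()


x = torch.tensor([10.0, -3.0, 250.0, 0.5])
u = ecdf_normalize(x)  # values: 0.75, 0.25, 1.0, 0.5
```

The outlier 250.0 ends up at 1.0 instead of dominating the range, which is why this transform tends to be handy for heavy-tailed features.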

NikolasMorshuis · October 2, 2023, 11:39am

I translated the ECDF function from statsmodels to PyTorch, hope it helps:

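import torch


class ECDF(torch.nn.Module):
    """Empirical CDF of a 1-D tensor, translated from statsmodels' ECDF."""

    def __init__(self, x, side='right'):
        super().__init__()

        if side.lower() not in ['right', 'left']:
            msg = "side can take the values 'right' or 'left'"
            raise ValueError(msg)
        # Store the lower-cased value, since searchsorted only accepts
        # 'left' or 'right' exactly.
        self.side = side.lower()

        if len(x.shape) != 1:
            msg = 'x must be 1-dimensional'
            raise ValueError(msg)

        x = x.sort()[0]
        nobs = len(x)
        # Step heights 1/n, 2/n, ..., 1 for the sorted samples.
        y = torch.linspace(1./nobs, 1, nobs, device=x.device)

        # Prepend (-inf, 0) so queries below the smallest sample return 0.
        # Buffers move with the module on .to(device).
        neg_inf = torch.tensor([float('-inf')], device=x.device)
        self.register_buffer('x', torch.cat((neg_inf, x)))
        self.register_buffer('y', torch.cat((torch.tensor([0.], device=y.device), y)))
        self.n = self.x.shape[0]

    def forward(self, time):
        # Index of the last step located at or before each query value.
        tind = torch.searchsorted(self.x, time, side=self.side) - 1
        return self.y[tind]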
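
To make the searchsorted lookup in forward easier to follow, here is a small worked example with four samples. It is a standalone sketch that rebuilds the two lookup tables by hand (the names xs, ys and queries are only for illustration) and shows how side changes the result when a query hits a sample exactly:

```python
import torch

samples = torch.tensor([1.0, 2.0, 3.0, 4.0])  # already sorted
n = samples.numel()

# Lookup tables: xs = [-inf, 1, 2, 3, 4], ys = [0, 0.25, 0.5, 0.75, 1]
xs = torch.cat((torch.tensor([float('-inf')]), samples))
ys = torch.arange(0, n + 1, dtype=samples.dtype) / n

queries = torch.tensor([0.5, 2.0, 2.5, 10.0])

# side='right': fraction of samples <= query
right = ys[torch.searchsorted(xs, queries, side='right') - 1]
# side='left': fraction of samples < query
left = ys[torch.searchsorted(xs, queries, side='left') - 1]

# right values: 0.0, 0.5, 0.5, 1.0
# left values:  0.0, 0.25, 0.5, 1.0
```

The two only differ at 2.0, which is itself a sample: 'right' counts it, 'left' does not.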

Now you can use the ECDF function in PyTorch in the same way as the one in statsmodels:

x1 = torch.randn(1000)
x2 = torch.randn(1000)
ecdf_fn = ECDF(x1)
y = ecdf_fn(x2)
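
As a sanity check, the result can be compared against the definition of the empirical CDF, i.e. the fraction of samples less than or equal to each query. The sketch below is self-contained and does not use the class above. The function names ecdf_direct and ecdf_searchsorted are made up for this example, and it assumes 1-D float tensors with the default side='right' behaviour:

```python
import torch


def ecdf_direct(x: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    # Boolean matrix of shape (len(v), len(x)); row i marks samples <= v[i].
    return (x.unsqueeze(0) <= v.unsqueeze(1)).to(x.dtype).mean(dim=1)


def ecdf_searchsorted(x: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    # Same quantity via binary search on the sorted samples.
    sorted_x = x.sort().values
    counts = torch.searchsorted(sorted_x, v, side='right')
    return counts.to(x.dtype) / x.numel()


torch.manual_seed(0)
x1 = torch.randn(1000)
x2 = torch.randn(1000)

max_diff = (ecdf_direct(x1, x2) - ecdf_searchsorted(x1, x2)).abs().max()
```

The direct version needs memory proportional to len(x) * len(v), so it is only meant for checking. The searchsorted approach should scale much better for large tensors.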