# How to calculate the multivariate normal distribution using pytorch and math?

The multivariate normal distribution is given as

f(x) = 1 / sqrt((2 * pi)^d * det(covariance)) * exp(-(x - mean)^T covariance^-1 (x - mean) / 2)

The formula can be calculated using numpy, for example, the following way:

```python
import numpy as np

def multivariate_normal_distribution(x, d, mean, covariance):
    x_m = x - mean
    return (1.0 / (np.sqrt((2 * np.pi)**d * np.linalg.det(covariance)))
            * np.exp(-(np.linalg.solve(covariance, x_m).T.dot(x_m)) / 2))
```
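For what it's worth, here is a quick self-contained sanity check (with made-up inputs) that this density matches the same formula written with an explicit inverse:

```python
import numpy as np

def multivariate_normal_distribution(x, d, mean, covariance):
    x_m = x - mean
    return (1.0 / (np.sqrt((2 * np.pi)**d * np.linalg.det(covariance)))
            * np.exp(-(np.linalg.solve(covariance, x_m).T.dot(x_m)) / 2))

# made-up 2-d example
x = np.array([0.5, -0.2])
mean = np.zeros(2)
covariance = np.array([[2.0, 0.3], [0.3, 1.0]])

p = multivariate_normal_distribution(x, 2, mean, covariance)

# same density with an explicit inverse instead of a solve
x_m = x - mean
p_ref = (np.exp(-0.5 * x_m @ np.linalg.inv(covariance) @ x_m)
         / np.sqrt((2 * np.pi)**2 * np.linalg.det(covariance)))
print(p, p_ref)
```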

I want to do the same calculation but instead of using numpy I want to use pytorch and math. The idea is the following:

```python
import math
import torch
import numpy as np

def multivariate_normal_distribution(x, d, mean, covariance):
    x_m = x - mean
    return (1.0 / (math.sqrt((2 * math.pi)**d * torch.linalg.det(covariance)))
            * torch.exp(-(np.linalg.solve(covariance, x_m).T.dot(x_m)) / 2))
```

The problem is in `np.linalg.solve()`. I know that `.solve` can solve a linear matrix equation or a system of linear scalar equations. So I want to know if there is a pytorch function that does exactly the same? Help is much appreciated!

Did you try this?

```python
import torch
from torch.distributions import multivariate_normal

dist = multivariate_normal.MultivariateNormal(loc=torch.zeros(5), covariance_matrix=torch.eye(5))
probability = torch.exp(dist.log_prob(torch.randn(5)))
```

Thanks a lot, I will try that. But do you know if there is a function in pytorch which does the same as `np.linalg.solve` in the above formula? Thank you very much!

Hi John!

`torch.solve` is pytorch's general linear-equation solver.

Your `covariance` should be symmetric positive definite, so, if you
cared, you could go the Cholesky-factorization route and use
`torch.cholesky()` together with `torch.cholesky_solve()`.
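For example, the Cholesky route might look like this (a sketch with made-up inputs; `torch.linalg.cholesky` is the current spelling of the factorization):

```python
import torch

# sketch of the Cholesky route with a made-up SPD covariance and point x_m
torch.manual_seed(0)
a = torch.randn(3, 3)
covariance = a @ a.T + 3 * torch.eye(3)   # symmetric positive definite
x_m = torch.randn(3, 1)

chol = torch.linalg.cholesky(covariance)       # covariance = chol @ chol.T
sol = torch.cholesky_solve(x_m, chol)          # solves covariance @ sol = x_m
quad = (x_m.T @ sol).item()                    # Mahalanobis term x_m^T cov^-1 x_m

# same quantity via a general solve, for comparison
quad_ref = (x_m.T @ torch.linalg.solve(covariance, x_m)).item()
print(quad, quad_ref)
```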

(You could also pre-compute the `covariance`'s inverse with
`torch.inverse` or `torch.cholesky_inverse`.)

But unless you have an unusual use case, I would go with Arul's
suggestion and use
`torch.distributions.multivariate_normal.MultivariateNormal`.

Best.

K. Frank


Hi @KFrank

so you mean that if I have a given symmetric positive definite covariance, I could also write:

`torch.exp(-(torch.cholesky_solve(covariance, x_m).T.dot(x_m)))`

for the right term with the exponential?

Thank you very much for your help!

@InnovArul I still have one question. Is the part for the Mahalanobis distance in the formula you wrote:
`dist = multivariate_normal.MultivariateNormal(loc=torch.zeros(5), covariance_matrix=torch.eye(5))`
the same as the one from the literature? Because in the literature the Mahalanobis distance is given with a square root instead of −0.5 as a factor.

Hi John!

Basically, but `torch.cholesky_solve()` takes the Cholesky factor matrix,
so you have to do it in two steps.

Here is an illustrative script:

```python
import torch
torch.__version__

_ = torch.random.manual_seed (2020)

ma = torch.randn ((5, 5))
mspd = ma @ ma.T                           # make an SPD matrix
v = torch.randn ((5, 1))                   # make a vector
solveA = torch.solve (v, mspd)[0]          # solve direct
mcho = torch.cholesky (mspd)               # Cholesky factorization
solveB = torch.cholesky_solve (v, mcho)    # solve with Cholesky factorization
print ('solveA =', solveA)
print ('solveB =', solveB)
print ('solveA - solveB =', solveA - solveB)
```

And here is its output:

```python
>>> import torch
>>> torch.__version__
'1.6.0'
>>> _ = torch.random.manual_seed (2020)
>>> ma = torch.randn ((5, 5))
>>> mspd = ma @ ma.T                           # make an SPD matrix
>>> v = torch.randn ((5, 1))                   # make a vector
>>> solveA = torch.solve (v, mspd)[0]          # solve direct
>>> mcho = torch.cholesky (mspd)               # Cholesky factorization
>>> solveB = torch.cholesky_solve (v, mcho)    # solve with Cholesky factorization
>>> print ('solveA =', solveA)
solveA = tensor([[ 6.6744],
        [-1.5211],
        [ 1.1208],
        [ 1.1884],
        [-3.9746]])
>>> print ('solveB =', solveB)
solveB = tensor([[ 6.6743],
        [-1.5211],
        [ 1.1208],
        [ 1.1884],
        [-3.9745]])
>>> print ('solveA - solveB =', solveA - solveB)
solveA - solveB = tensor([[ 1.7643e-05],
        [-4.2915e-06],
        [ 2.1458e-06],
        [ 3.0994e-06],
        [-9.2983e-06]])
```

Best.

K. Frank


You can take a look at the source code here if it helps:

They calculate squared Mahalanobis distance in `_batch_mahalanobis()`.
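As a sanity check (with made-up inputs), `exp(log_prob)` agrees with the density formula from the top of the thread computed by hand:

```python
import math
import torch
from torch.distributions.multivariate_normal import MultivariateNormal

torch.manual_seed(0)
d = 4
a = torch.randn(d, d)
covariance = a @ a.T + d * torch.eye(d)    # made-up SPD covariance
mean = torch.zeros(d)
x = torch.randn(d)

dist = MultivariateNormal(loc=mean, covariance_matrix=covariance)
p_torch = torch.exp(dist.log_prob(x))

# manual density: exp(-0.5 * x_m^T cov^-1 x_m) / sqrt((2 pi)^d det(cov))
x_m = (x - mean).unsqueeze(1)
quad = (x_m.T @ torch.linalg.solve(covariance, x_m)).item()
p_manual = (math.exp(-0.5 * quad)
            / math.sqrt((2 * math.pi)**d * torch.linalg.det(covariance).item()))
print(p_torch.item(), p_manual)
```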


@KFrank I should have read your last post a little closer because you wrote "Cholesky-factorization route" and added both functions. Thank you for the helpful code snippet

So in my case I just have to write:

```python
tmp = torch.cholesky(covariance)
res = torch.cholesky_solve(x_m, tmp)
```

Does it matter whether `cholesky_solve()` takes the vector first and then the matrix, or the other way round? And what if `x_m` is a distance matrix whose entries are Euclidean distances? I know that Mahalanobis distances are there to calculate the distance between a data point and a distribution,
but is it desired that we first calculate a Euclidean distance with `x_m` and then subsequently calculate the Mahalanobis distance?

Hey guys, thanks for the replies. In my case I have a single covariance matrix for a neighborhood (kernel) of a given pixel of an image, and I need to compute the log_prob for all pixels in the image using their neighborhoods. That is, for each pixel (i, j) I have to use a fixed, previously computed covariance matrix and mean to compute the multivariate normal log_prob over the neighboring pixels [i±k, j±k], with k the kernel size. One approach is to use `torch.Tensor.unfold` to generate a view of the image tensor with windows matching the kernel size, reshape, and pass it to `log_prob` of the MultivariateNormal. This approach works but it uses a lot of memory. Would anyone know a faster/better way to do that?
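To make the approach concrete, here is a minimal sketch of what I mean (sizes and values are made up, and it does not address the memory issue):

```python
import torch
from torch.distributions.multivariate_normal import MultivariateNormal

torch.manual_seed(0)
k = 3                                    # kernel size: each window has k*k pixels
img = torch.randn(1, 1, 8, 8)            # (batch, channels, H, W), made-up image

# one fixed, precomputed mean and covariance over the k*k window entries
mean = torch.zeros(k * k)
a = torch.randn(k * k, k * k)
covariance = a @ a.T + (k * k) * torch.eye(k * k)
dist = MultivariateNormal(loc=mean, covariance_matrix=covariance)

# unfold extracts every k x k window as a column: (batch, k*k, n_windows)
windows = torch.nn.functional.unfold(img, kernel_size=k)
windows = windows.transpose(1, 2).reshape(-1, k * k)   # (n_windows, k*k)

log_probs = dist.log_prob(windows)       # one log-density per window
print(log_probs.shape)                   # torch.Size([36]) for 8x8, k=3, no padding
```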