# Sampling from non-identity covariance matrix, but not satisfying constraint PositiveDefinite()

I’m trying to use `torch.distributions.multivariate_normal` to sample from a non-identity covariance normal distribution. My covariance matrix is 4096x4096, and I have estimated it from data.

However, when I try to run `m = multivariate_normal.MultivariateNormal(torch.zeros(4096), cov)`, I get the following:

```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [11], in <cell line: 1>()
----> 1 m = multivariate_normal.MultivariateNormal(torch.zeros(4096), cov)

File /ext3/miniconda3/lib/python3.9/site-packages/torch/distributions/multivariate_normal.py:177, in MultivariateNormal.__init__(self, loc, covariance_matrix, precision_matrix, scale_tril, validate_args)
174 self.loc = loc.expand(batch_shape + (-1,))
176 event_shape = self.loc.shape[-1:]
--> 177 super().__init__(batch_shape, event_shape, validate_args=validate_args)
179 if scale_tril is not None:

File /ext3/miniconda3/lib/python3.9/site-packages/torch/distributions/distribution.py:68, in Distribution.__init__(self, batch_shape, event_shape, validate_args)
66         valid = constraint.check(value)
67         if not valid.all():
---> 68             raise ValueError(
69                 f"Expected parameter {param} "
70                 f"({type(value).__name__} of shape {tuple(value.shape)}) "
71                 f"of distribution {repr(self)} "
72                 f"to satisfy the constraint {repr(constraint)}, "
73                 f"but found invalid values:\n{value}"
74             )
75 super().__init__()

ValueError: Expected parameter covariance_matrix (Tensor of shape (4096, 4096)) of distribution MultivariateNormal(loc: torch.Size([4096]), covariance_matrix: torch.Size([4096, 4096])) to satisfy the constraint PositiveDefinite(), but found invalid values:
tensor([[1.0383, 0.9125, 0.6980,  ..., 0.4600, 0.6637, 0.8524],
[0.9125, 1.0187, 0.8847,  ..., 0.2892, 0.4774, 0.6710],
[0.6980, 0.8847, 0.9858,  ..., 0.1393, 0.2974, 0.4747],
...,
[0.4600, 0.2892, 0.1393,  ..., 0.9690, 0.8648, 0.6774],
[0.6637, 0.4774, 0.2974,  ..., 0.8648, 0.9972, 0.8932],
[0.8524, 0.6710, 0.4747,  ..., 0.6774, 0.8932, 1.0227]])
```

The matrix is symmetric, however: when I evaluate `(cov == cov.T).sum()` I get `16777216`, which is 4096**2. When I compute the eigenvalues, however, using:

```
eigs = torch.linalg.eigvals(cov)
min(eigs.real)
```

I get a negative value: `tensor(-5.7518e-06)`. Perhaps because the matrix is large, there is some numerical error in the eigenvalue computation that leads to a negative number and therefore fails PyTorch's check? I notice that when I subsample the matrix even slightly, to 4000x4000 instead of 4096x4096, I am able to initialise the torch `MultivariateNormal` object without any errors.

Does anyone have a suggested workaround? Does it make sense for torch to use such a numerically sensitive assertion here, when simply checking that the matrix is symmetric might suffice?

Hi Chris!

A 4096x4096 covariance matrix is pretty large, so you are likely asking for numerical issues.

Note that a covariance matrix estimated from data will differ from the "true" population covariance.

I assume that you perform your estimate in a way that will give you a
mathematically positive-definite covariance matrix. Nonetheless, with
a matrix that size there is a good chance that it will be numerically
non-positive-definite.

Your suggestion that the negative eigenvalue is an artifact of the eigenvalue computation is possible. I think it more likely, though, that your matrix – which is a floating-point approximation to your "true" matrix – really does have a (slightly) negative eigenvalue. But, regardless, it's more or less the same issue.

Slightly reducing the size of the matrix seems fragile. I could well believe that a negative eigenvalue would show up again were you to rerun with a slightly different dataset.

You might be able to get away with performing this part of the computation in double precision. But you might just end up with a negative eigenvalue again, only now with a value of about `-1.e-14`.
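A minimal sketch of the double-precision idea (with the size shrunk and the covariance synthesized for illustration – your `cov` would come from your data):

```python
import torch

torch.manual_seed(0)
n = 64  # small stand-in; the original matrix is 4096 x 4096

# Synthetic symmetric positive-semidefinite covariance estimate.
A = torch.randn(n, n)
cov = (A @ A.T) / n

# Redo the positive-definiteness-sensitive step in float64, where
# round-off in the eigenvalues is several orders of magnitude smaller.
cov64 = cov.double()
m = torch.distributions.MultivariateNormal(
    torch.zeros(n, dtype=torch.float64), covariance_matrix=cov64
)

# Samples can be cast back to float32 afterwards if the rest of the
# pipeline runs in single precision.
samples = m.sample((5,)).float()
print(samples.dtype, samples.shape)
```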

Without knowing your use case, it’s hard to say. What do you use the
samples you draw from `MultivariateNormal` for?

You might be able to “regularize” your covariance matrix by adding a
small “epsilon”, say, `1.e-5`, to its diagonal. You could explicitly reduce
the size of the matrix by keeping only the top 100 or 1000 or 2000
eigenvectors that correspond to the largest eigenvalues (“principal
components”).
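The diagonal "jitter" version of this might look like the following sketch (again with the size shrunk and a synthetic covariance standing in for yours):

```python
import torch

torch.manual_seed(0)
n = 64  # small stand-in; the original matrix is 4096 x 4096

# Synthetic covariance estimate (already PSD here; a real estimate
# may be numerically indefinite).
A = torch.randn(n, n)
cov = (A @ A.T) / n

# Add a small epsilon to the diagonal to push all eigenvalues
# away from zero, at the cost of slightly inflating the variances.
eps = 1.0e-5
cov_reg = cov + eps * torch.eye(n)

m = torch.distributions.MultivariateNormal(
    torch.zeros(n), covariance_matrix=cov_reg
)
samples = m.sample((10,))
print(samples.shape)
```

The choice of epsilon is a trade-off: large enough to dominate the round-off error in the eigenvalues, small enough not to meaningfully distort the distribution.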

You might also explore `LowRankMultivariateNormal`.
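A sketch of that route, combining the principal-components idea above with `torch.distributions.LowRankMultivariateNormal` (sizes shrunk and the covariance synthesized for illustration):

```python
import torch

torch.manual_seed(0)
n, rank = 64, 8  # small stand-in sizes; the original problem has n = 4096

# Synthetic symmetric PSD covariance estimate.
A = torch.randn(n, n)
cov = (A @ A.T) / n

# Keep only the top-`rank` eigenpairs (the principal components).
# eigh returns eigenvalues in ascending order for a symmetric matrix.
eigvals, eigvecs = torch.linalg.eigh(cov)
top_vals = eigvals[-rank:].clamp_min(0.0)  # guard against tiny negatives
top_vecs = eigvecs[:, -rank:]

# The distribution's covariance is cov_factor @ cov_factor.T + diag(cov_diag),
# so scaling the eigenvectors by sqrt(eigenvalue) reproduces the top-rank
# part of cov; a small diagonal keeps the covariance strictly positive.
cov_factor = top_vecs * top_vals.sqrt()    # shape (n, rank)
cov_diag = torch.full((n,), 1.0e-5)

m = torch.distributions.LowRankMultivariateNormal(
    torch.zeros(n), cov_factor=cov_factor, cov_diag=cov_diag
)
samples = m.sample((10,))
print(samples.shape)
```

Besides sidestepping the positive-definiteness check on the full matrix, this is much cheaper: sampling works with an n x rank factor rather than an n x n matrix.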

Best.

K. Frank