torch.inverse vs torch.cholesky_inverse - when to use which?

AlphaBetaGamma96 · November 5, 2021, 12:48pm

Hi All,

I just have a quick question about inverting matrices: is there any convention on which function to use? Is it just torch.cholesky_inverse for PSD matrices and torch.inverse for all other cases? I ask because I’ve used both on a PSD matrix and they give different results. Is this just numerical noise?

Thank you!

KFrank · November 5, 2021, 2:18pm

Hi Alpha!

Yes, use cholesky_inverse() for symmetric-positive-definite (SPD)
matrices: if you know your matrix is SPD, you can use it, and doing
so will be faster.

And yes, use inverse() for all other cases, because in the non-SPD
case cholesky_inverse() won’t work.
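
A rough sketch of that rule of thumb, in case it helps. Note that spd_aware_inverse is a made-up helper name, not a PyTorch function, and the sketch assumes a PyTorch version that provides torch.linalg.cholesky_ex. It also passes the Cholesky factor, not the matrix itself, to cholesky_inverse().

```python
import torch

def spd_aware_inverse(A: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: use the Cholesky route when A looks SPD, else a general inverse."""
    # cholesky_ex only reads one triangle of A, so symmetry is checked separately.
    if torch.allclose(A, A.transpose(-2, -1)):
        # info == 0 means the factorization succeeded, i.e. A is positive definite.
        L, info = torch.linalg.cholesky_ex(A)
        if bool((info == 0).all()):
            # cholesky_inverse expects the factor L (with A = L @ L.T), not A.
            return torch.cholesky_inverse(L)
    # Fallback for non-symmetric or indefinite matrices.
    return torch.linalg.inv(A)

torch.manual_seed(0)
B = torch.randn(4, 4, dtype=torch.float64)
spd = B @ B.transpose(-2, -1) + 4 * torch.eye(4, dtype=torch.float64)  # SPD by construction
indefinite = torch.tensor([[0.0, 1.0], [1.0, 0.0]], dtype=torch.float64)  # symmetric, not PD

inv_spd = spd_aware_inverse(spd)                # takes the Cholesky branch
inv_indefinite = spd_aware_inverse(indefinite)  # falls back to torch.linalg.inv
```

Attempting the factorization doubles as the positive-definiteness test here, so nothing is computed twice on the SPD path.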

You don’t say how large the discrepancy is, so my default assumption
is that you are seeing floating-point round-off error. An easy test for
this is to repeat the same computations using float64 tensors. If the
discrepancy is reduced by several orders of magnitude, you are most
likely seeing round-off error.
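
A minimal sketch of that test. The matrix, the seed, and the helper name method_discrepancy are invented for illustration, and the sketch assumes the Cholesky factor is computed first with torch.linalg.cholesky before it is handed to cholesky_inverse().

```python
import torch

torch.manual_seed(0)
A = torch.randn(3, 3, dtype=torch.float64)
# Adding a multiple of the identity keeps the test matrix well-conditioned and SPD.
M64 = A @ A.transpose(-2, -1) + 3 * torch.eye(3, dtype=torch.float64)
M32 = M64.float()

def method_discrepancy(M: torch.Tensor) -> float:
    """Largest absolute difference between the two inverses of M."""
    L = torch.linalg.cholesky(M)            # M = L @ L.T
    via_cholesky = torch.cholesky_inverse(L)
    via_general = torch.linalg.inv(M)
    return (via_cholesky - via_general).abs().max().item()

err32 = method_discrepancy(M32)
err64 = method_discrepancy(M64)
print(f"float32 discrepancy: {err32:.3e}")
print(f"float64 discrepancy: {err64:.3e}")
```

If the float64 number is many orders of magnitude smaller than the float32 one, round-off is the likely explanation. If the discrepancy barely changes, as in a result that differs in the first digit, something other than round-off is going on.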

Best.

K. Frank

AlphaBetaGamma96 · November 5, 2021, 4:53pm

Hi @KFrank,

Here’s an example:

>>> import torch
>>> 
>>> A = torch.randn(3,3)
>>> M = A@A.transpose(-2,-1)
>>> 
>>> M
tensor([[ 1.6559, -0.9484,  0.6780],
        [-0.9484,  1.2809, -0.0119],
        [ 0.6780, -0.0119,  0.5705]])
>>> torch.cholesky_inverse(M)
tensor([[ 1.0665,  0.3376, -1.2417],
        [ 0.3376,  0.6097,  0.0284],
        [-1.2417,  0.0284,  3.0724]])
>>> torch.inverse(M)
tensor([[ 5.9354,  4.3301, -6.9640],
        [ 4.3301,  3.9398, -5.0643],
        [-6.9640, -5.0643,  9.9240]])
>>> 
>>> M=M.double()
>>> M
tensor([[ 1.6559, -0.9484,  0.6780],
        [-0.9484,  1.2809, -0.0119],
        [ 0.6780, -0.0119,  0.5705]], dtype=torch.float64)
>>> torch.cholesky_inverse(M)
tensor([[ 1.0665,  0.3376, -1.2417],
        [ 0.3376,  0.6097,  0.0284],
        [-1.2417,  0.0284,  3.0724]], dtype=torch.float64)
>>> torch.inverse(M)
tensor([[ 5.9354,  4.3301, -6.9640],
        [ 4.3301,  3.9398, -5.0643],
        [-6.9640, -5.0643,  9.9240]], dtype=torch.float64)

One thing I’ve just checked is multiplying each inverse by M, which should give back the identity matrix: torch.cholesky_inverse doesn’t give back I, but torch.inverse does.

>>> iM_c = torch.cholesky_inverse(M)
>>> iM_i = torch.inverse(M)
>>> 
>>> iM_c @ M
tensor([[ 6.0391e-01, -5.6433e-01,  1.0679e-02],
        [ 3.1153e-17,  4.6053e-01,  2.3788e-01],
        [ 4.1231e-16,  1.1777e+00,  9.1056e-01]], dtype=torch.float64)
>>> iM_i @ M
tensor([[ 1.0000e+00,  4.3730e-16,  5.4966e-16],
        [-2.7408e-16,  1.0000e+00,  2.3931e-16],
        [ 6.0420e-16, -6.4525e-16,  1.0000e+00]], dtype=torch.float64)

I assume I’m using torch.cholesky_inverse incorrectly and should be using torch.inverse instead?

Thank you for your help!

Edit: So, after re-reading the docs, you need to apply torch.cholesky to M first and then call torch.cholesky_inverse on the resulting factor, rather than applying torch.cholesky_inverse directly to M! (I naively thought the factorization was done internally; see the torch.cholesky_inverse page in the PyTorch 1.10.0 documentation.)
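
For anyone landing here later, a sketch of the corrected usage. It re-enters M from the printout above, so the values are rounded to four decimals, and it uses torch.linalg.cholesky, which I understand to be the newer spelling of torch.cholesky. The last part is my reading of what the direct call computed: it treats the lower triangle of M as if it were the Cholesky factor.

```python
import torch

# M copied from the printed tensor above (rounded to four decimals).
M = torch.tensor([[ 1.6559, -0.9484,  0.6780],
                  [-0.9484,  1.2809, -0.0119],
                  [ 0.6780, -0.0119,  0.5705]], dtype=torch.float64)
eye = torch.eye(3, dtype=M.dtype)

# Correct usage: factor first, then invert from the factor.
L = torch.linalg.cholesky(M)        # lower-triangular, M = L @ L.T
iM_c = torch.cholesky_inverse(L)    # inverse of M, computed from L
iM_i = torch.linalg.inv(M)          # general-purpose inverse

print(torch.allclose(iM_c, iM_i))       # the two inverses should now agree
print(torch.allclose(iM_c @ M, eye))    # and iM_c @ M should be the identity

# What the direct call did: it used tril(M) as though it were the factor,
# so it returned the inverse of tril(M) @ tril(M).T rather than of M.
wrong = torch.cholesky_inverse(M)
T = torch.tril(M)
inverse_of_other_matrix = torch.linalg.inv(T @ T.transpose(-2, -1))
print(torch.allclose(wrong, inverse_of_other_matrix))
```

That also explains why switching to float64 changed nothing: the two calls were inverting different matrices, so the difference was never round-off.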