Instability of torch.inverse?

Dear all,

I am trying to do the following: given a column vector q of shape (5 x 1), I want to calculate the inverse of (q @ q.t()). This shouldn't be a difficult task; however, the inverse returned by torch.inverse(q @ q.t()) is way off:

In [358]: q = torch.randn(5,1)
In [359]: q
Out[359]: 
tensor([[ 0.2144],
        [ 0.2899],
        [-1.0016],
        [-2.5544],
        [ 0.3549]])

In [360]: (q @ q.t()) @ torch.inverse(q @ q.t())
Out[360]: 
tensor([[ 0.2019, -1.0537,  0.2293, -0.3905, -0.7937],
        [-0.2614, -0.9735,  0.2418, -0.4921, -1.4343],
        [ 0.4410,  7.5348,  0.6504,  1.9400,  3.7739],
        [ 4.5718, 17.0569, -1.1591,  4.7328, 12.1593],
        [ 0.3166, -1.8443,  0.1681, -0.5732, -0.4462]])

This calculation should obviously return something close to the identity, so is this a numerical issue? If I instead create a (5 x 5) matrix via w = torch.randn(5,5) and calculate (w @ w.t()) @ torch.inverse(w @ w.t()), there is no issue and the result is close to the identity. Is there something special about multiplying a column vector by its transpose to obtain a matrix?

Or maybe what I am missing is that q @ q.t() is always singular, or numerically close to singular, so that its inverse is not actually defined?

Best regards

Hi Raphikowski!

Yes, your suspicion is correct.

By construction, your matrix is mathematically non-invertible.
(Numerically, it is likely technically invertible, but singular within
round-off error, so you get nonsense results when you try to invert
it numerically.)
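
You can confirm the rank deficiency directly. A minimal sketch, assuming a recent PyTorch where torch.linalg is available (not the 0.3.0 used in the script further below):

import torch

torch.manual_seed(2020)
q = torch.randn(5, 1)
qq = q @ q.t()

# qq is the outer product of a single vector, so its rank is 1, not 5
print(torch.linalg.matrix_rank(qq))   # tensor(1)

# equivalently, its condition number blows up (a huge float or inf)
print(torch.linalg.cond(qq))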

Your matrix – by construction – is (a multiple of) a projection matrix; that is, up to the overall factor q.norm()**2, matrix-vector multiplication with it projects any vector onto the vector q. As such, it has rank one: its single non-zero eigenvalue is q.norm()**2, with corresponding eigenvector (proportional to) q, and its remaining eigenvalues are all zero, which is why it is singular.

This is illustrated by the following (version 0.3.0) script:

import torch
torch.__version__

torch.manual_seed (2020)
q = torch.randn(5,1)
q   # your random vector
qq = q @ q.t()
qq   # your square "projection" matrix

vals, vecs = torch.eig (qq, True)

vals   # all but one zero eigenvalue

max, ind = vals[:,0].max (0)   # get index of non-zero eigenvalue
vec = vecs[:,ind]   # get non-zero eigenvector

q.div (vec)   #  q is (proportional to) the non-zero eigenvector

And here is the output:

>>> import torch
>>> torch.__version__
'0.3.0b0+591e73e'
>>>
>>> torch.manual_seed (2020)
<torch._C.Generator object at 0x000001F5A11D6630>
>>> q = torch.randn(5,1)
>>> q   # your random vector

 1.2372
-0.9604
 1.5415
-0.4079
 0.8806
[torch.FloatTensor of size 5x1]

>>> qq = q @ q.t()
>>> qq   # your square "projection" matrix

 1.5307 -1.1882  1.9072 -0.5046  1.0895
-1.1882  0.9224 -1.4805  0.3917 -0.8457
 1.9072 -1.4805  2.3763 -0.6287  1.3574
-0.5046  0.3917 -0.6287  0.1664 -0.3592
 1.0895 -0.8457  1.3574 -0.3592  0.7754
[torch.FloatTensor of size 5x5]

>>>
>>> vals, vecs = torch.eig (qq, True)
>>>
>>> vals   # all but one zero eigenvalue

 0.0000  0.0000
 5.7712  0.0000
 0.0000  0.0000
-0.0000  0.0000
-0.0000  0.0000
[torch.FloatTensor of size 5x2]

>>>
>>> max, ind = vals[:,0].max (0)   # get index of non-zero eigenvalue
>>> vec = vecs[:,ind]   # get non-zero eigenvector
>>>
>>> q.div (vec)   #  q is (proportional to) the non-zero eigenvector

 2.4023
 2.4023
 2.4023
 2.4023
 2.4023
[torch.FloatTensor of size 5x1]

>>>
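
On recent PyTorch versions torch.eig has been removed; here is the same check as a sketch using torch.linalg.eigh (which applies because qq is symmetric):

import torch

torch.manual_seed(2020)
q = torch.randn(5, 1)
qq = q @ q.t()

# eigh returns eigenvalues in ascending order; for this rank-one
# symmetric matrix all but the last are (numerically) zero
vals, vecs = torch.linalg.eigh(qq)
print(vals)

# the single non-zero eigenvalue equals q.norm()**2 ...
print(vals[-1].item(), (q.norm()**2).item())

# ... and q is proportional to the corresponding eigenvector
print(q.squeeze() / vecs[:, -1])   # a constant vector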

Best.

K. Frank


Hi Frank,
thank you very much for your answer!
Maybe another question, then:
If I have a multivariate-linear-regression kind of problem of the form
X @ b = y
with X being an (N x M) matrix, b an (M x 1) column vector and y an (N x 1) column vector, and I want to solve this for X rather than for b, what can I do to make the solution numerically tractable? My naive solution would have been
X = y @ b.t() @ (b @ b.t())^-1
but here I obviously have the problem of inverting the projection matrix (b @ b.t()).
Is there a common workaround for that?
Best regards

Hello Raphikowski!

To be clear: Given vectors b and y, you wish to find a matrix X
such that X @ b = y.

The first thing to note is that there will be many such X. If Q is a
matrix that has y as an eigenvector (with eigenvalue one) and R
is a matrix that has b as an eigenvector (with eigenvalue one),
then Q @ X and X @ R will both also solve your equation, that is,
both (Q @ X) @ b = y and (X @ R) @ b = y.
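
For instance, here is a hedged sketch of the Q @ X case (the names u and v are mine; Q = I + u @ v.t() with v chosen orthogonal to y fixes y, i.e. Q @ y = y):

import torch

torch.manual_seed(0)
b = torch.randn(5, 1)
y = torch.randn(7, 1)
X = (y @ b.t()) / b.norm()**2   # one particular solution (derived below)

# any Q = I + u @ v.t() with v orthogonal to y satisfies Q @ y = y
u = torch.randn(7, 1)
v = torch.randn(7, 1)
v = v - ((v.t() @ y) / (y.t() @ y)) * y   # remove the y-component of v
Q = torch.eye(7) + u @ v.t()

# Q @ X differs from X, yet solves the same equation
print(((Q @ X) @ b - y).abs().max())   # ~0 up to round-off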

One solution to your equation is given by

X = (y @ b.t()) / b.norm()**2

as illustrated by the following script:

import torch
torch.__version__

torch.manual_seed (2020)

# let M = 5 and N = 7, and generate random values for b and y
b = torch.randn(5,1)
y = torch.randn(7,1)
b
y

X = (y @ b.t()) / b.norm()**2
X

X @ b
# check result two ways
X @ b - y
X @ b / y

# try another random vector
c = torch.randn(5,1)
# multiplication by this X always gives a vector proportional to y
X @ c / y

And here is its output:

>>> import torch
>>> torch.__version__
'0.3.0b0+591e73e'
>>>
>>> torch.manual_seed (2020)
<torch._C.Generator object at 0x0000022EF2646630>
>>>
>>> # let M = 5 and N = 7, and generate random values for b and y
... b = torch.randn(5,1)
>>> y = torch.randn(7,1)
>>> b

 1.2372
-0.9604
 1.5415
-0.4079
 0.8806
[torch.FloatTensor of size 5x1]

>>> y

 0.0529
 0.0751
 0.4777
-0.6759
-2.1489
-1.1463
-0.2720
[torch.FloatTensor of size 7x1]

>>>
>>> X = (y @ b.t()) / b.norm()**2
>>> X

 0.0113 -0.0088  0.0141 -0.0037  0.0081
 0.0161 -0.0125  0.0201 -0.0053  0.0115
 0.1024 -0.0795  0.1276 -0.0338  0.0729
-0.1449  0.1125 -0.1805  0.0478 -0.1031
-0.4607  0.3576 -0.5740  0.1519 -0.3279
-0.2457  0.1908 -0.3062  0.0810 -0.1749
-0.0583  0.0453 -0.0727  0.0192 -0.0415
[torch.FloatTensor of size 7x5]

>>>
>>> X @ b

 0.0529
 0.0751
 0.4777
-0.6759
-2.1489
-1.1463
-0.2720
[torch.FloatTensor of size 7x1]

>>> # check result two ways
... X @ b - y

1.00000e-07 *
  0.0000
  0.0000
  0.2980
  0.0000
 -2.3842
  0.0000
  0.0000
[torch.FloatTensor of size 7x1]

>>> X @ b / y

 1.0000
 1.0000
 1.0000
 1.0000
 1.0000
 1.0000
 1.0000
[torch.FloatTensor of size 7x1]

>>>
>>> # try another random vector
... c = torch.randn(5,1)
>>> # multiplication by this X always gives a vector proportional to y
... X @ c / y

-0.2834
-0.2834
-0.2834
-0.2834
-0.2834
-0.2834
-0.2834
[torch.FloatTensor of size 7x1]
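
Incidentally, this X is exactly the minimum-norm (pseudoinverse) solution: for a column vector b, the Moore–Penrose pseudoinverse is b.t() / b.norm()**2. A sketch, assuming a recent PyTorch with torch.linalg.pinv:

import torch

torch.manual_seed(2020)
b = torch.randn(5, 1)
y = torch.randn(7, 1)

X_manual = (y @ b.t()) / b.norm()**2
X_pinv = y @ torch.linalg.pinv(b)   # pinv(b) equals b.t() / b.norm()**2 here

# the two constructions agree up to round-off
print((X_manual - X_pinv).abs().max())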

I will say, however, that I don't really understand what you are trying
to do. There's a lot of freedom in the matrix X – in fact, probably too
much freedom – so I suspect that what you are trying to do here isn't
really what you want to be doing.

Good luck!

K. Frank

Dear Frank,

thank you so much for your efforts. I think you are right in saying that what I am trying to do is a dead end :smiley:
Best regards!