`torch.linalg.lu_factor` with `pivot=False` doesn't return `LU` matrix?


I have a matrix K that I represent with an LU factorization, K = L @ U, where L and U are lower- and upper-triangular 2-D tensors, respectively. I'm currently testing the functionality of torch.linalg.lu_factor with pivot=False, which forces P (the permutation matrix) to be the identity.
In my sample code I have the following:

import torch

L = torch.randn(3, 3).tril()
L.diagonal()[:] = torch.ones(3)  # To enforce 1's on the diagonal
U = torch.randn(3, 3).triu()
K = L @ U

If then I do:
LU_factor, pivots = torch.linalg.lu_factor(K, pivot=False)
I would expect LU_factor to be the same matrix as K, but it's not. Moreover, if I do:
Pu, Lu, Uu = torch.lu_unpack(LU_factor, pivots)
then Lu and Uu are exactly L and U (as expected).

My question is: if setting pivot=False in lu_factor forces P to be the identity, what other operations do lu_factor() and lu_unpack() perform under the hood that make LU_factor different from L @ U?

Thanks beforehand!

Hi Marc!

Even when you turn off pivoting, lu_factor() still performs the non-trivial
LU factorization and returns the "packed" factorization in LU_factor: L and U
stored together in a single square matrix, with L's unit diagonal left implicit.
So LU_factor is not the original, unfactored K. (Why would it be?)

Here is a script that illustrates this by constructing the “packed” factorization
from your original L and U matrices:

import torch
print (torch.__version__)

_ = torch.manual_seed (2023)

L = torch.randn (3, 3, device = 'cuda').tril()
L.diagonal()[:] = torch.ones (3, device = 'cuda')   # To enforce 1's on the diagonal
U = torch.randn (3, 3, device = 'cuda').triu()
K = L @ U

LU_factor, pivots = torch.linalg.lu_factor (K, pivot=False)

tind = torch.triu_indices (3, 3)                    # "pack" factorization into a single square matrix
LU_pack = L.clone()
LU_pack[tind[0], tind[1]] = U[tind[0], tind[1]]

print ('LU_factor = ...')
print (LU_factor)
print ('LU_pack = ...')
print (LU_pack)
print ('allclose:', torch.allclose (LU_factor, LU_pack))

And here is its output:

LU_factor = ...
tensor([[-0.4913,  0.5382, -3.1042],
        [ 0.7900, -0.2999,  0.2209],
        [ 1.1008, -0.1755,  0.5342]], device='cuda:0')
LU_pack = ...
tensor([[-0.4913,  0.5382, -3.1042],
        [ 0.7900, -0.2999,  0.2209],
        [ 1.1008, -0.1755,  0.5342]], device='cuda:0')
allclose: True
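Going the other direction, you can also "unpack" LU_factor by hand: the strictly-lower part plus a unit diagonal gives L, and the upper part (diagonal included) gives U. Here is a minimal sketch on the CPU; it uses the default partial pivoting (pivot=False is not implemented on the CPU, which is why my script above runs on CUDA), so lu_unpack() is still used to recover P from the pivot indices:

```python
import torch

torch.manual_seed(2023)

K = torch.randn(3, 3)

# factor with the default partial pivoting
LU_factor, pivots = torch.linalg.lu_factor(K)

# "unpack" by hand: strictly-lower part of LU_factor plus a unit
# diagonal gives L; the upper part (including the diagonal) gives U
L_rec = LU_factor.tril(-1) + torch.eye(3)
U_rec = LU_factor.triu()

# lu_unpack() turns the pivot indices into the permutation matrix P
P, _, _ = torch.lu_unpack(LU_factor, pivots)

# P @ L @ U reproduces the original K
print(torch.allclose(P @ L_rec @ U_rec, K, atol=1e-6))  # True
```

With pivot=False (on CUDA), P is the identity and the same tril / triu split alone recovers L and U.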


K. Frank

Oh, my bad! I thought LU_factor would be L @ U rather than L and U packed together (which makes a lot of sense now that you mention it, for many reasons). Thanks for the very quick reply!