Upper Triangular Matrix Vectorization

Hi guys, does PyTorch have a function that would return the vectorized upper triangular part of a matrix?
For example, given the tensor [[1, 2, 3], [4, 5, 6], [7, 8, 9]], I want [1, 2, 3, 5, 6, 9]. I know NumPy has triu_indices, which returns the indices of the upper triangular part, and with those indices I can easily get the elements I want from a NumPy matrix, but it seems I can't do that in PyTorch.
Thanks!

1 Like

I don’t think there’s an API for directly returning the vectorized upper triangular matrix, but you can still achieve it:

>>> a = torch.arange(1, 10).view(3, 3)
>>> a
 1 2 3
 4 5 6
 7 8 9
[torch.FloatTensor of size 3x3]

>>> triu_indices = a.triu().nonzero().t()
>>> triu_indices
 0 0 0 1 1 2
 0 1 2 1 2 2
[torch.LongTensor of size 2x6]

>>> vectorized_upper_triangular_matrix = a[triu_indices[0], triu_indices[1]]
>>> vectorized_upper_triangular_matrix
1
2
3
5
6
9
[torch.FloatTensor of size 6]

Maybe there’s a better way that I’m not aware of.

5 Likes

Note that this only works when there are no zeros in the upper triangular part.
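For instance, a zero entry in the upper triangle simply drops out of the index list (illustrative run):

>>> b = torch.Tensor([[1, 0, 3], [4, 5, 6], [7, 8, 9]])
>>> b.triu().nonzero().t()
 0 0 1 1 2
 0 2 1 2 2
[torch.LongTensor of size 2x5]

so the element at position (0, 1) would be silently skipped.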

It would be cool if we could get more support for this in core pytorch. Also important is the opposite – going from a vectorization to an upper/lower triangular matrix. I haven’t been able to find a clean way to do this yet.

3 Likes

Here’s another way that should work more generally:

In [180]: x = torch.randn(3, 3)

In [181]: x
Out[181]: 

 1.6177  0.5082  1.3817
-0.5801 -0.4657  0.8086
-0.4783 -0.3208  1.4459
[torch.FloatTensor of size 3x3]

In [182]: x[torch.triu(torch.ones(3, 3)) == 1]
Out[182]: 

 1.6177
 0.5082
 1.3817
-0.4657
 0.8086
 1.4459
[torch.FloatTensor of size 6]

And you can do the reverse similarly,

In [183]: vec = x[torch.triu(torch.ones(3, 3)) == 1]

In [184]: y = torch.zeros(3, 3)

In [185]: y[torch.triu(torch.ones(3, 3)) == 1] = vec

In [186]: y
Out[186]: 

 1.6177  0.5082  1.3817
 0.0000 -0.4657  0.8086
 0.0000  0.0000  1.4459
[torch.FloatTensor of size 3x3]
3 Likes

Although this doesn’t seem to work on Variables:

RuntimeError: can't assign Variable to a torch.FloatTensor using a mask (only torch.FloatTensor or float are supported)

You could do this with a mask

def tril_mask(value):
    n = value.size(-1)
    coords = value.new(n)
    torch.arange(n, out=coords)
    # entry (i, j) is True when j <= i, i.e. the lower triangle (including the diagonal)
    return coords <= coords.view(n, 1)

which is used as

>>> value = torch.arange(9).view(3,3)
>>> value[tril_mask(value)]
 0
 3
 4
 6
 7
 8
[torch.FloatTensor of size 6]
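
Note that this gives the lower triangle; for the upper triangle asked about in this thread, you could flip the comparison (a small, untested variation):

def triu_mask(value):
    n = value.size(-1)
    coords = value.new(n)
    torch.arange(n, out=coords)
    return coords >= coords.view(n, 1)  # entry (i, j) is True when j >= i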
1 Like

The trick is to use NumPy itself within torch without hurting backpropagation.
For x as a 2D tensor this works for me:

import numpy as np

# indices of the upper triangle (including the diagonal)
row_idx, col_idx = np.triu_indices(x.shape[1])
row_idx = torch.LongTensor(row_idx).cuda()
col_idx = torch.LongTensor(col_idx).cuda()
x = x[row_idx, col_idx]

For a 3D tensor (assuming the first dimension is batch):

row_idx, col_idx = np.triu_indices(x.shape[2])
row_idx = torch.LongTensor(row_idx).cuda()
col_idx = torch.LongTensor(col_idx).cuda()
x = x[:, row_idx, col_idx]

Note that the whole process is still differentiable in pytorch.
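
As a side note, more recent PyTorch versions also provide torch.triu_indices, which avoids the NumPy round trip and the explicit .cuda() calls (untested sketch, assuming a version where it is available):

idx = torch.triu_indices(x.shape[1], x.shape[1], device=x.device)
x = x[idx[0], idx[1]]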

2 Likes

Hello, I have the exact opposite issue!

I have a vector with n*(n-1)/2 elements (the elements of an upper triangular matrix without the main diagonal).

I want to assign the vector into an upper triangular matrix (n by n) and still keep the whole process differentiable in pytorch.

I have tried: mat[np.triu_indices(n, 1)] = vector

1 is the offset because I don't use the main diagonal.

and I get the RuntimeError: a leaf Variable that requires grad has been used in an in-place operation.

Any suggestions without using in-place operations ?

2 Likes

No, but I have a suggestion without requiring grad on a leaf variable :slight_smile::

import numpy

n = 5
mat = torch.zeros(n, n)
vector = torch.randn(10, requires_grad=True)
mat[numpy.triu_indices(n, 1)] = vector
mat.sum().backward()
vector.grad

gives the expected vector of 10 ones. So mat doesn't require grad when it is instantiated; it only picks it up indirectly when you assign elements requiring grad to it.
That works.
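
A quick sanity check on the snippet above:

>>> mat.requires_grad   # False right after torch.zeros, True after the assignment
True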

Best regards

Thomas

4 Likes

Oh, cool idea. Thank you, Tom!

@tom
I understood and reproduced your code snippet.

I have an issue though in integrating it in a custom layer.

My goal is to compute gradients for Q (a 3x3 matrix)!

My problem is that I cannot compute gradients (None is shown).

I suspect there is something wrong with the computational graph or an in-place operation that I am missing.

Here is a sample of my custom layer: (I tried to keep it clear and simple :slightly_smiling_face: )

class AdaLayer(torch.nn.Module):

    def __init__(self, Li):
        super(AdaLayer, self).__init__()
        self.Q = nn.Parameter(torch.rand((3, 3), dtype=torch.float))
        self.M = nn.Parameter(torch.mm(self.Q, self.Q.t()), requires_grad=True)  # symmetric
        self.W = None
        self.mat = None
        self.linear1 = torch.nn.Linear(re_size*re_size*3, out_size)

    def forward(self, img_flat, uniq_dif):
        dim = img_flat.shape[0]
        self.M = nn.Parameter(torch.mm(self.Q, self.Q.t()))

        vector = Variable(torch.diag(torch.mm(torch.mm(uniq_dif, self.M), uniq_dif.t())), requires_grad=True)

        # as discussed
        self.mat = torch.zeros(dim, dim)
        self.mat[np.triu_indices(dim, 1)] = vector

        # auto-gradient reaches up to here!
        self.W = nn.Parameter(torch.exp(-1*(torch.sqrt((self.mat + self.mat.t())))/2), requires_grad=True)
        y = (torch.mm(self.W, img_flat)).t().contiguous()
        print(y.shape)
        y = self.linear1(y.view(1, -1))
        return y

Output:

------------------Q 3x3 gradient----------------------
None
------------------M symmetric positive semidefinite gradient----------------------
None
------------------ mat gradient----------------------
None
------------------W gradient----------------------

tensor([[-1.7305e-05, -1.1781e-05, -1.3871e-05, …, -5.0380e-06,
-1.9574e-05, -6.8704e-06],
[ 1.3375e-06, 1.0680e-06, 1.2287e-06, …, 5.4148e-07,
1.3086e-06, 6.5805e-07],
[-3.6731e-06, -2.9817e-06, -3.5238e-06, …, -1.0123e-06,
-2.3968e-06, -1.4685e-06],
…, etc…

Any extra help would be greatly appreciated. ( I am new in pytorch :stuck_out_tongue: )

Kind regards ,
black0017

I couldn’t reproduce this, as the snippet won’t run as posted (triple backticks “```” around the code would help the formatting, too).
nn.Parameter breaks the computational graph and should only be used for model state, not for computed values, and the same goes for Variable(...). It is almost certainly an error to instantiate a new nn.Parameter in forward.
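
For instance, a rough, untested sketch of a forward that keeps everything as plain tensors (reusing the names from the snippet above):

def forward(self, img_flat, uniq_dif):
    dim = img_flat.shape[0]
    M = torch.mm(self.Q, self.Q.t())  # plain tensor, stays connected to self.Q
    vector = torch.diag(torch.mm(torch.mm(uniq_dif, M), uniq_dif.t()))

    mat = torch.zeros(dim, dim)
    mat[np.triu_indices(dim, 1)] = vector

    W = torch.exp(-torch.sqrt(mat + mat.t()) / 2)
    y = torch.mm(W, img_flat).t().contiguous()
    return self.linear1(y.view(1, -1))

That way the gradient should flow from the output all the way back to self.Q.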

Best regards

Thomas

@tom Thanks a lot, your insights helped me a lot!

Have a nice day !
black0017