How to get the row index of specific values in a tensor

Sorry for the stupid question, but I cannot find a fast way to solve my issue, so I thought maybe the experts here can help me, or maybe PyTorch has a function that already does this in a fast way.

I have a tensor of size BxRxC:
e.g.
here T has dimensions 1x3x4:
T = torch.round(torch.rand(1,3,4)*10)
T =
6 8 10 8
2 4 7 2
5 0 4 1
Now I have another tensor K with a much larger size. I know that K contains each row of T somewhere in it, along with other values, but I don't know where they are.
e.g.
here K has dimensions 1x9x4:
K = torch.cat((torch.round(torch.rand(1,3,4)*10),T, torch.zeros(1,3,4)),1)
K =
5 7 8 1
8 2 7 8
0 10 8 8
6 8 10 8
2 4 7 2
5 0 4 1
0 0 0 0
0 0 0 0
0 0 0 0

As we can see, K contains the rows of T at rows 3, 4, and 5 (counting from 0).
In terms of size, B and C will always be the same in both T and K.
How can I get the row indices in K that contain the rows of T?

Also, if I have another tensor D, and let's say I have the row indices from the last step, how can I extract only those rows from D? Meaning that if D is:
D = torch.round(torch.rand(1,9,4)*10)
D =
2 6 8 7
3 3 9 9
4 4 4 4
2 7 5 2
3 1 9 7
3 4 4 7
1 5 2 1
3 7 1 7
5 9 8 10

I want the output to be
O =
2 7 5 2
3 1 9 7
3 4 4 7

The output will be the same size as T.

P.S. I only multiplied by 10 and rounded to make the values easier to read; they are not always integers.

Take the difference across the common row dimension of K and T:
d = T.unsqueeze(2) - K.unsqueeze(1)
This will be of size (1, 3, 9, 4) for your example. Where the rows are identical we get 0, 0, 0, 0, so sum over the last dimension: dsum = d.sum(-1)
Now find out where dsum has zeros:
loc = (dsum == 0).nonzero()
Since all 3 rows were found somewhere in K, this will have shape (3, 3); if only 2 rows were found it would have shape (2, 3). You are interested in the locations inside K, so you need loc[:, -1]

Assuming D is the same size as K, to take out the relevant rows you'd do:
D[:, loc[:, -1], :]
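
Putting these steps together, here is a minimal runnable sketch using the example shapes from the question (variable names follow the thread; note that the follow-up just below refines the sum with torch.abs):

    import torch

    T = torch.round(torch.rand(1, 3, 4) * 10)              # (1, 3, 4)
    K = torch.cat((torch.round(torch.rand(1, 3, 4) * 10),
                   T,
                   torch.zeros(1, 3, 4)), 1)               # (1, 9, 4), contains the rows of T
    D = torch.round(torch.rand(1, 9, 4) * 10)              # same number of rows as K

    d = T.unsqueeze(2) - K.unsqueeze(1)                    # broadcasts to (1, 3, 9, 4)
    dsum = d.sum(-1)                                       # (1, 3, 9); zero where a row of T matches a row of K
    loc = (dsum == 0).nonzero()                            # (num_matches, 3)
    row_idx = loc[:, -1]                                   # row positions inside K

    O = D[:, row_idx, :]                                   # the corresponding rows of D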


Thanks a lot! It just has one minor problem: we should use dsum = torch.abs(d).sum(-1) instead of d.sum(-1), because the sum of signed differences can be zero even when the differences themselves are not all zero.
See the following example:

    T = torch.round(torch.rand(1,3,4)*10)
    4 3 6 3
    1 0 5 8
    1 10 4 8
    K = torch.cat((torch.round(torch.rand(1,3,4)*10),T, torch.zeros(1,3,4)),1)
    0 9 1 9
    3 3 4 7
    9 3 1 3
    4 3 6 3
    1 0 5 8
    1 10 4 8
    0 0 0 0
    0 0 0 0
    0 0 0 0
    D = torch.round(torch.rand(1,9,4)*10)
    3 6 7 2
    4 0 5 9
    4 2 5 10
    4 4 9 2
    2 0 2 6
    2 1 4 0
    1 4 8 3
    4 3 8 0
    2 3 9 10
    d = T.unsqueeze(2) - K.unsqueeze(1)
    (0 ,0 ,.,.) =
    4 -6 5 -6
    1 0 2 -4
    -5 0 5 0
    0 0 0 0
    3 3 1 -5
    3 -7 2 -5
    4 3 6 3
    4 3 6 3
    4 3 6 3

    (0 ,1 ,.,.) =
    1 -9 4 -1
    -2 -3 1 1
    -8 -3 4 5
    -3 -3 -1 5
    0 0 0 0
    0 -10 1 0
    1 0 5 8
    1 0 5 8
    1 0 5 8

    (0 ,2 ,.,.) =
    1 1 3 -1
    -2 7 0 1
    -8 7 3 5
    -3 7 -2 5
    0 10 -1 0
    0 0 0 0
    1 10 4 8
    1 10 4 8
    1 10 4 8

    dsum = d.sum(-1)
    (0 ,.,.) =
    -3 -1 0 0 2 -7 16 16 16
    -5 -3 -2 -2 0 -9 14 14 14
    4 6 7 7 9 0 23 23 23
    [torch.FloatTensor of size 1x3x9]

    loc = (dsum==0).nonzero()
    0 0 2
    0 0 3
    0 1 4
    0 2 5
    loc[:,-1]
    2
    3
    4
    5
    D[:,loc[:,-1],:]
    (0 ,.,.) =
    4 2 5 10
    4 4 9 2
    2 0 2 6
    2 1 4 0
    [torch.FloatTensor of size 1x4x4]
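
For completeness, applying the absolute sum to the same example removes the spurious match at position 2 (a quick check against the tensors printed above):

    dsum = torch.abs(d).sum(-1)       # (1, 3, 9), all entries non-negative now
    loc = (dsum == 0).nonzero()
    row_idx = loc[:, -1]              # -> 3, 4, 5: only the true matches in K
    O = D[:, row_idx, :]              # rows 3, 4, 5 of D, same shape as T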


On another note, can you help me understand what loc = (dsum==0).nonzero() does?
So if the output of (dsum==0) is

  0  0  1  0  0  0
  0  0  0  1  0  0

then
(dsum==0).nonzero()
gives us

 0  0  2
 0  1  3

I understand that 2 and 3 are the indices, but what is 1 here?

The result gives you a tensor containing the indices of all nonzero occurrences, with shape [num_nonzeros, ndims].
The first column (loc[:, 0]) gives the indices in dim0, the second one in dim1, etc.
As dsum has three dimensions, the second row stands for dsum[0, 1, 3].
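
A tiny standalone sketch of how the columns map to dimensions (the shapes here are chosen only for illustration):

    import torch

    mask = torch.zeros(1, 2, 6)
    mask[0, 0, 2] = 1
    mask[0, 1, 3] = 1

    loc = mask.nonzero()
    # loc has shape [num_nonzeros, ndims] = [2, 3]:
    # [[0, 0, 2],
    #  [0, 1, 3]]
    # column 0 indexes dim0, column 1 indexes dim1, column 2 indexes dim2,
    # so the second row corresponds to mask[0, 1, 3].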


Good catch, it should be the sum of absolute values.

The above code for determining tensor indices is very useful; lately, however, I came across an anomaly when doing this operation.
This is the function I am using to find the indices in the host tensor at which the rows of the target tensor occur.

import torch

def get_index(host, target):
    # pairwise difference between every target row and every host row
    diff = target.unsqueeze(1) - host.unsqueeze(0)
    dsum = torch.abs(diff).sum(-1)
    loc = (dsum == 0).nonzero()
    # the last column of loc is the matching row index inside host
    return loc[:, -1]

For example, I wanted to extract the indices from a 2D tensor of shape (40, 2), such that Target[:, 1] = 0 and 1. This is the result I got:

tensor([[0.0000, 0.0000],
        [1.0000, 0.0000],
        [0.0000, 0.1111],
        [1.0000, 0.1111],
        [0.0000, 0.2222],
        [1.0000, 0.2222],
        [0.0000, 0.3333],
        [1.0000, 0.3333],
        [0.0000, 0.4444],
        [1.0000, 0.4444],
        [0.0000, 0.5556],
        [1.0000, 0.5556],
        [0.0000, 0.7778],
        [1.0000, 0.7778],
        [0.0000, 0.8889],
        [1.0000, 0.8889],
        [0.0000, 1.0000],
        [1.0000, 1.0000]])

The value 0.6667 is missing from this output. Can anyone explain this anomaly, or am I doing something wrong? @ptrblck, could you please suggest anything?

You might be running into rounding errors due to the limited precision of floating-point operations.
Try comparing dsum to a small eps via dsum <= eps instead of dsum == 0.
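
A minimal way to fold this into the get_index helper above (the eps value here is just an illustrative default; scale it to your data):

    import torch

    def get_index(host, target, eps=1e-6):
        diff = target.unsqueeze(1) - host.unsqueeze(0)
        dsum = torch.abs(diff).sum(-1)
        # treat "close enough to zero" as a match to absorb floating-point rounding error
        loc = (dsum <= eps).nonzero()
        return loc[:, -1]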


Thanks a ton, it solved my problem