Any way of filtering given rows from a tensor A?

I have a matrix A that is an n x 2 tensor representing a collection of n points with coordinates (x, y). I have a separate tensor B of shape m x 2 (with n > m), and every point in B is also in A. I want a tensor C that is the collection of points in A that are not in B (the set difference).

In regular Python, if we have a list of points (as tuples) and we want to find the difference, we do the following:
C = set(A) - set(B)
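
For example, with the points stored as tuples:

A = [(1, 1), (10, 6), (4, 5), (9, 8), (0, 6), (7, 1), (17, 16)]
B = [(1, 1), (7, 1)]
C = set(A) - set(B)
print(C)  # {(10, 6), (4, 5), (9, 8), (0, 6), (17, 16)} -- order is not preserved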

I want to know if there is a way to do this on the GPU. I know how to do it for a one-dimensional tensor with a boolean mask (non-intersection of 1-D tensors), but if I apply the same idea to a two-dimensional tensor, the output gets flattened and loses the dimensionality of the points.
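
A minimal example of the flattening I mean: when the boolean mask has the same shape as A, indexing selects individual elements, so the result comes back as a 1-D tensor.

import torch

A = torch.tensor([[1, 1], [10, 6], [4, 5]], device='cuda')
mask = torch.tensor([[False, False], [True, True], [True, True]], device='cuda')
print(A[mask])  # tensor([10,  6,  4,  5], device='cuda:0')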

My code so far:

def set_differ(a, b):
    A = torch.tensor(a, device='cuda')
    B = torch.tensor(b, device='cuda')
    size = A.size()
    # bool_vec is a 1-d mask that tells whether the point stored at that index is included in the difference
    bool_vec = torch.ones(1, size[0], dtype=torch.bool, device='cuda')
    for point in B:
        # if the point (x, y) is in A, flip the value at that index to False
        bool_vec = bool_vec & (A != point).any(1)
    # stack bool_vec so that it has the same shape as A
    bool_vec = torch.cat([bool_vec] * size[1], 0)
    set_difference = A[torch.t(bool_vec)]
    return set_difference

inputs and outputs:

A = tensor([[ 1,  1],
        [10,  6],
        [ 4,  5],
        [ 9,  8],
        [ 0,  6],
        [ 7,  1],
        [17, 16]], device='cuda:0')
B = tensor([[1, 1],
        [7, 1]], device='cuda:0')

desired output = tensor([[10,  6],
        [ 4,  5],
        [ 9,  8],
        [ 0,  6],
        [17, 16]], device='cuda:0')
actual output: tensor([10,  6,  4,  5,  9,  8,  0,  6, 17, 16], device='cuda:0')

I think the easiest solution is to use a list comprehension to compute a mask. The following piece of code seems to do what you want:

import torch

A = torch.tensor([1,1,10,6,4,5,9,8,0,6,7,1,17,16]).view(-1,2).cuda()
B = torch.tensor([1,1,7,1]).view(-1,2).cuda()

print('A:\n', A)
print('B:\n', B)

def set_differ(A, B):
  r"""Returns the elements of A without the elements of B"""
  mask = torch.as_tensor([elem not in B for elem in A])
  return A[mask]

print('out:')
set_differ(A, B)

returns

A:
 tensor([[ 1,  1],
        [10,  6],
        [ 4,  5],
        [ 9,  8],
        [ 0,  6],
        [ 7,  1],
        [17, 16]], device='cuda:0')
B:
 tensor([[1, 1],
        [7, 1]], device='cuda:0')
out:

tensor([[10,  6],
        [ 4,  5],
        [ 9,  8],
        [ 0,  6],
        [17, 16]], device='cuda:0')

EDIT:

Another solution would be to use the fact that a point x from A also belongs to B iff its closest point in B is at distance 0. You can then compute the pairwise distances between the points of A and the points of B and use them to filter A.

def set_differ2(A, B):
  # pairwise distances between every point of A and every point of B
  cdist = torch.cdist(A.float(), B.float())
  # a point of A belongs to B iff its nearest point in B is at distance 0
  min_dist = torch.min(cdist, dim=1).values
  return A[min_dist > 0]

From an algorithmic point of view, that second solution should be slower; however, thanks to some CUDA magic, it seems to be faster on your example:

%timeit set_differ(A, B)
%timeit set_differ2(A, B)

returns

1000 loops, best of 3: 752 µs per loop
1000 loops, best of 3: 259 µs per loop

I don’t know which solution is faster for larger point clouds, so I suggest you try both methods with bigger values for m and n.
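
For instance, something along these lines (a rough sketch; the sizes are arbitrary, and the torch.cuda.synchronize() calls are there so the GPU timings are meaningful):

import timeit

n, m = 2000, 50
A_big = torch.randint(0, 1000, (n, 2)).cuda()
B_big = A_big[torch.randperm(n, device='cuda')[:m]]  # m rows taken from A, so B_big is a subset of A_big

def bench(fn):
  torch.cuda.synchronize()
  t = timeit.timeit(lambda: fn(A_big, B_big), number=10)
  torch.cuda.synchronize()
  return t / 10

print('set_differ: ', bench(set_differ))
print('set_differ2:', bench(set_differ2))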

That is exactly what I wanted. Why is a list comprehension allowed in this situation? I just started PyTorch, so maybe I’m wrong, but I thought any list comprehension would run on the CPU. If not, how is it running on the GPU?

I don’t know whether list comprehensions are actually forbidden when dealing with tensors on the GPU, but yes, they are computed on the CPU, so it is often better to vectorize than to use a list comprehension if you can. That could explain why the second method is faster than the first one.
However, note that with the first method the list comprehension only builds a Python list of Python booleans, so there are no issues with lists of GPU tensors or anything like that.
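
If you want to stay entirely on the GPU, another option is to vectorize the membership test itself by broadcasting the comparison between every point of A and every point of B. Here is a sketch (the name set_differ3 is just for illustration); note that it builds an n x m x 2 intermediate tensor, so it trades memory for speed:

def set_differ3(A, B):
  # (n, 1, 2) == (1, m, 2) broadcasts to (n, m, 2); all(-1) is True where a row of A equals a row of B
  matches = (A.unsqueeze(1) == B.unsqueeze(0)).all(dim=-1)
  # keep the rows of A that match no row of B
  return A[~matches.any(dim=1)]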

I believe when A contains many more elements than B, it should be faster to loop over B and leave the comparison with A to parallel computation. Hope this alternative solution helps others.

def check_one(tensor, element):
    # 1 where a row of `tensor` differs from `element`, 0 where the row equals it
    return 1 - torch.prod(tensor == element, dim=1)

def set_differ(A, B):
    # multiply the per-point masks: a row of A survives only if it differs from every row of B
    keep = torch.prod(torch.stack([check_one(A, e) for e in B]), dim=0)
    return A[keep.nonzero().ravel()]
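
With the A and B from the question, this returns the expected rows:

set_differ(A, B)
# tensor([[10,  6],
#         [ 4,  5],
#         [ 9,  8],
#         [ 0,  6],
#         [17, 16]], device='cuda:0')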