# Row-wise comparisons between 2D-tensors

Hi everyone!

I’m trying to compare all row elements of two 2D tensors. A simple example would be the following two tensors:

``````a = torch.tensor([[1,2], [4,5], [7,8]])
b = torch.tensor([[2,3], [7,5], [-1,7]])
``````

Now I’d like to check for each element in the first tensor if it is part of the same row in the second tensor. My expected result would be

``````[
[False, False] (1 vs [2, 3])
[True, False] (2 vs [2, 3])
[False, False] (4 vs  [7,5])
[False, True] (5 vs  [7,5])
[False, True] (7 vs  [-1,7])
[False, False] (8 vs  [-1,7])
]
``````

Does anyone have any idea how to solve this efficiently?

Thanks a lot!

This certainly isn’t memory-efficient, but you could check whether it yields a speedup compute-wise:

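``````import torch

a = torch.tensor([[1,2], [4,5], [7,8]])
b = torch.tensor([[2,3], [7,5], [-1,7]])

# Compare every element of a with all of b: the result has shape [6, 3, 2]
ret = a.view(-1, 1, 1) == b
# Row of b that each flattened element of a belongs to: [0, 0, 1, 1, 2, 2]
idx = torch.arange(3).unsqueeze(1).expand(-1, 2).reshape(-1)
print(ret[torch.arange(ret.size(0)), idx])
# > tensor([[False, False],
#           [ True, False],
#           [False, False],
#           [False,  True],
#           [False,  True],
#           [False, False]])
``````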

That is very helpful, thank you very much!

Another implementation:

``````res = a.repeat_interleave(2, dim=1).reshape(-1, 2) == b.repeat_interleave(2, dim=0)
``````

Thanks a lot for this, cool to see that there are so many possibilities to solve this problem!
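As a quick cross-check, here is a small sketch that runs both approaches above on the example tensors and confirms they agree. The third variant, `res_broadcast`, is not from the replies above; it is an assumed alternative that broadcasts only within matching rows, so treat it as an illustration rather than a recommendation.

```python
import torch

a = torch.tensor([[1, 2], [4, 5], [7, 8]])
b = torch.tensor([[2, 3], [7, 5], [-1, 7]])

# Approach 1: compare every element of a with all of b, then pick the matching row.
ret = a.view(-1, 1, 1) == b
idx = torch.arange(a.size(0)).unsqueeze(1).expand(-1, a.size(1)).reshape(-1)
res_index = ret[torch.arange(ret.size(0)), idx]

# Approach 2: repeat elements of a and rows of b so they line up one-to-one.
res_repeat = a.repeat_interleave(2, dim=1).reshape(-1, 2) == b.repeat_interleave(2, dim=0)

# Assumed alternative: [3, 2, 1] == [3, 1, 2] broadcasts to [3, 2, 2],
# i.e. element a[i, j] is compared only with row b[i].
res_broadcast = (a.unsqueeze(2) == b.unsqueeze(1)).reshape(-1, b.size(1))

print(torch.equal(res_index, res_repeat))
print(torch.equal(res_index, res_broadcast))
```

All three produce a `[6, 2]` boolean tensor in the same row order as the expected result in the question.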

Eta_C’s solution seems to be quite a bit faster for large tensors (shape [10000, 2]):

``````from timeit import default_timer as timer

import torch

N = 10000
a = torch.rand([N, 2])
b = torch.rand([N, 2])

# Approach 1: broadcast comparison followed by indexing
start = timer()
idx = torch.arange(N).unsqueeze(1).expand(-1, 2).reshape(-1)
for _ in range(500):
    ret = a.view(-1, 1, 1) == b

    res = ret[torch.arange(ret.size(0)), idx]
end = timer()
print(end - start)

# Approach 2: repeat_interleave
start2 = timer()
for _ in range(500):
    res = a.repeat_interleave(2, dim=1).reshape(-1, 2) == b.repeat_interleave(2, dim=0)
end2 = timer()
print(end2 - start2)
``````

121.2189056
0.11425909999999817
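One likely reason for the gap (an assumption based on the tensor shapes, not a profiled result) is the size of the intermediate each approach builds. The sketch below uses a small `N` so it runs quickly and prints the element counts of both intermediates.

```python
import torch

N = 100  # small value for illustration; the benchmark above used N = 10000
a = torch.rand(N, 2)
b = torch.rand(N, 2)

# Approach 1 compares every element of a with every row of b.
ret = a.view(-1, 1, 1) == b  # shape [2N, N, 2]

# Approach 2 compares each element of a only with its own row of b.
res = a.repeat_interleave(2, dim=1).reshape(-1, 2) == b.repeat_interleave(2, dim=0)  # shape [2N, 2]

print(tuple(ret.shape), ret.numel())
print(tuple(res.shape), res.numel())
```

The first intermediate has 2N·N·2 elements and the second has 2N·2, so for N = 10000 that is 400,000,000 booleans versus 40,000.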

Hi, hope you’re doing well.
I have 2 tensors with unequal sizes:

a = torch.tensor([[8,2], [5,3],[4,4]])
b = torch.tensor([[1,2],[5,3]])

I want a boolean tensor indicating whether each row of `a` exists in `b`, without iterating. Something like
a in b
which should give

[False, True, False]

This should work:

``````import torch

a = torch.tensor([[8,2], [5,3],[4,4]])
b = torch.tensor([[1,2],[5,3]])

res = (a.unsqueeze(0) == b.unsqueeze(1)).all(dim=2).any(dim=0)
print(res)
# > tensor([False,  True, False])
``````

The `all(dim=2)` operation makes sure that all elements of a row match, while the `any(dim=0)` operation checks whether any row of `b` matches the corresponding row of `a`.
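To make the two reduction steps easier to follow, here is the same expression split into its intermediates. The names `eq` and `row_match` are introduced only for this illustration.

```python
import torch

a = torch.tensor([[8, 2], [5, 3], [4, 4]])
b = torch.tensor([[1, 2], [5, 3]])

# [1, 3, 2] == [2, 1, 2] broadcasts to [2, 3, 2]:
# eq[j, i, k] is True when a[i, k] == b[j, k].
eq = a.unsqueeze(0) == b.unsqueeze(1)

# Reduce over the column dimension: row_match[j, i] is True
# only when the whole row b[j] equals the whole row a[i].
row_match = eq.all(dim=2)  # shape [2, 3]

# Reduce over the rows of b: res[i] is True when any row of b equals a[i].
res = row_match.any(dim=0)  # shape [3]

print(eq.shape)
print(row_match)
print(res)
```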

Hi, I was looking for the same thing and came up with a similar solution. However, could this approach cause huge memory consumption if the tensors involved are large? If so, is there another solution that uses less memory and does not require loops? Thanks!

Yes, the memory usage could be large since you are broadcasting the tensors and need to calculate the intermediates. Using loops would have a lower memory footprint, but could be slower. Your best bet might be to write a custom C++/CUDA operation for your use case and check if you could get a proper speedup without a large memory requirement.
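A middle ground between full broadcasting and a per-row loop is to loop over chunks of rows. The sketch below is only an illustration of that trade-off: the helper name `rows_in` and the `chunk_size` parameter are hypothetical, and whether it is fast enough depends on your tensor sizes.

```python
import torch

def rows_in(a, b, chunk_size=1024):
    """For each row of `a`, report whether an identical row exists in `b`.

    `a` is processed in chunks, so the broadcast intermediate has shape
    [b.size(0), chunk_size, a.size(1)] at most, instead of
    [b.size(0), a.size(0), a.size(1)].
    """
    out = []
    for chunk in a.split(chunk_size, dim=0):
        # Same comparison as the one-liner above, applied to one chunk of a.
        match = (chunk.unsqueeze(0) == b.unsqueeze(1)).all(dim=2).any(dim=0)
        out.append(match)
    return torch.cat(out)

a = torch.tensor([[8, 2], [5, 3], [4, 4]])
b = torch.tensor([[1, 2], [5, 3]])

result = rows_in(a, b, chunk_size=2)
print(result)
```

A smaller `chunk_size` lowers peak memory at the cost of more Python-level iterations.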

Hi, happy new year! Wishing you a happy and healthy year.
I have 2 tensors:
tensor([ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1,
1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3,
3, 3, 3, 3, 3, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6, 7, 7,
7, 7, 8, 8, 8, 8, 8, 9, 9, 10, 10, 10, 11, 12, 12, 13, 13, 13,
13, 13, 14, 14, 15, 15, 16, 16, 17, 17, 18, 18, 19, 19, 19, 20, 20, 21,
21, 22, 22, 23, 23, 23, 23, 23, 24, 24, 24, 25, 25, 25, 26, 26, 27, 27,
27, 27, 28, 28, 28, 29, 29, 29, 29, 30, 30, 30, 30, 31, 31, 31, 31, 31,
31, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 33, 33, 33, 33, 33,
33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33])
and
tensor([ 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 17, 19, 21, 31, 0, 2,
3, 7, 13, 17, 19, 21, 30, 0, 1, 3, 7, 8, 9, 13, 27, 28, 32, 0,
1, 2, 7, 12, 13, 0, 6, 10, 0, 6, 10, 16, 0, 4, 5, 16, 0, 1,
2, 3, 0, 2, 30, 32, 33, 2, 33, 0, 4, 5, 0, 0, 3, 0, 1, 2,
3, 33, 32, 33, 32, 33, 5, 6, 0, 1, 32, 33, 0, 1, 33, 32, 33, 0,
1, 32, 33, 25, 27, 29, 32, 33, 25, 27, 31, 23, 24, 31, 29, 33, 2, 23,
24, 33, 2, 31, 33, 23, 26, 32, 33, 1, 8, 32, 33, 0, 24, 25, 28, 32,
33, 2, 8, 14, 15, 18, 20, 22, 23, 29, 30, 31, 33, 8, 9, 13, 14, 15,
18, 19, 20, 22, 23, 26, 27, 28, 29, 30, 31, 32])
and I have one more tensor, named “a”, of size 34*34.
I want to access some, but not all, elements of “a” based on the two tensors above.
For example, I need a[0][1], a[0][2], a[0][3], a[0][4], a[0][5], a[0][6], a[0][7], a[0][8], but I don’t need a[0][9] because 9 is not paired with 0 in the second tensor. Likewise, I need a[1][2], a[1][3], a[1][7], but I don’t need a[1][4] because 4 is not paired with 1 in the second tensor.

Direct indexing should work:

``````import torch

x = torch.tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1,
1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3,
3, 3, 3, 3, 3, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6, 7, 7,
7, 7, 8, 8, 8, 8, 8, 9, 9, 10, 10, 10, 11, 12, 12, 13, 13, 13,
13, 13, 14, 14, 15, 15, 16, 16, 17, 17, 18, 18, 19, 19, 19, 20, 20, 21,
21, 22, 22, 23, 23, 23, 23, 23, 24, 24, 24, 25, 25, 25, 26, 26, 27, 27,
27, 27, 28, 28, 28, 29, 29, 29, 29, 30, 30, 30, 30, 31, 31, 31, 31, 31,
31, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 33, 33, 33, 33, 33,
33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33])

y = torch.tensor([1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 17, 19, 21, 31, 0, 2,
3, 7, 13, 17, 19, 21, 30, 0, 1, 3, 7, 8, 9, 13, 27, 28, 32, 0,
1, 2, 7, 12, 13, 0, 6, 10, 0, 6, 10, 16, 0, 4, 5, 16, 0, 1,
2, 3, 0, 2, 30, 32, 33, 2, 33, 0, 4, 5, 0, 0, 3, 0, 1, 2,
3, 33, 32, 33, 32, 33, 5, 6, 0, 1, 32, 33, 0, 1, 33, 32, 33, 0,
1, 32, 33, 25, 27, 29, 32, 33, 25, 27, 31, 23, 24, 31, 29, 33, 2, 23,
24, 33, 2, 31, 33, 23, 26, 32, 33, 1, 8, 32, 33, 0, 24, 25, 28, 32,
33, 2, 8, 14, 15, 18, 20, 22, 23, 29, 30, 31, 33, 8, 9, 13, 14, 15,
18, 19, 20, 22, 23, 26, 27, 28, 29, 30, 31, 32])

a = torch.randn(34, 34)
ret = a[x, y]

reference = []
for x_, y_ in zip(x, y):
    reference.append(a[x_, y_])
reference = torch.stack(reference)

print((ret == reference).all())
# > tensor(True)
``````
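For a smaller picture of what `a[x, y]` does, here is a toy example with made-up values: the two index tensors are read pairwise, so entry `i` of the result is `a[x[i], y[i]]`.

```python
import torch

# Toy data for illustration only: a 3x4 matrix holding the values 0..11.
a = torch.arange(12).reshape(3, 4)
x = torch.tensor([0, 0, 2])  # row indices
y = torch.tensor([1, 3, 2])  # column indices

ret = a[x, y]  # picks a[0, 1], a[0, 3], a[2, 2]
print(ret)
# > tensor([ 1,  3, 10])
```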

Hi, hope you’re doing well…
I have a dataset and split it into train_mask and test_mask…

from sklearn.model_selection import train_test_split

then I’ve used

You should be able to check it via `x.sum(dim=1).unique().size()`.