Complex indexing to avoid for-loop on GPU

Hello,

I am trying to perform complex indexing to avoid using for-loops on GPU. My problem is as follows: I have a tensor X of shape [B, N, H], a second tensor H of shape [B, N, M, H] (initialized with zeros), and a list of indices L, where each entry holds the coordinates l = [b, n, m]. Here B is the batch size, N the number of tokens in the sequence, M the number of sub_tokens (a subset of the N tokens), and H the hidden dimension.

Specifically, I want to populate H with samples from X; however, not every token n in H receives M samples, and the unused slots stay zero. There is also no sampling across the batch B. Let me illustrate with an example.

```
X = tensor([[[ 0,  1,  2,  3],
             [ 4,  5,  6,  7],
             [ 8,  9, 10, 11]],

            [[12, 13, 14, 15],
             [16, 17, 18, 19],
             [20, 21, 22, 23]]])
```

Index = [[0, 0, 0], [0, 0, 1], [0, 1, 0], [0, 2, 0], [0, 2, 1], [1, 0, 0], [1, 1, 1], [1, 2, 1], [1, 2, 2]]

```
H = tensor([[[[ 0,  1,  2,  3],
              [ 4,  5,  6,  7]],

             [[ 0,  1,  2,  3],
              [ 0,  0,  0,  0]],

             [[ 0,  1,  2,  3],
              [ 4,  5,  6,  7]]],

            [[[12, 13, 14, 15],
              [ 0,  0,  0,  0]],

             [[16, 17, 18, 19],
              [ 0,  0,  0,  0]],

             [[16, 17, 18, 19],
              [20, 21, 22, 23]]]])
```

If anyone has an idea how to solve this, that would be amazing!

Thank you,
Christoph

Hi,

I found a solution that tackles the problem from a different angle and does not require the complex indexing described in the post.

Thanks.
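
For anyone who lands here still wanting the indexing route: the example in the question can be reproduced without a Python loop. This is only a sketch and not the solution mentioned in the reply above. It rests on one reading of the example, namely that in each entry [b, n, m] the value m picks the source row X[b, m], and that the slot inside H[b, n] is the position of that entry within its (b, n) group. It also assumes the index list is sorted by (b, n), as in the example, and that no group has more than M entries. The function name `gather_sub_tokens` is made up for illustration.

```python
import torch


def gather_sub_tokens(X, index, M):
    """Fill a zero tensor of shape (B, N, M, H) with rows of X, loop-free.

    X:     (B, N, H) tensor.
    index: (K, 3) long tensor of rows [b, n, m], assumed sorted by (b, n).
    M:     number of slots per token; each (b, n) group must have <= M rows.
    """
    B, N, hidden = X.shape
    b, n, m = index.unbind(dim=1)

    # One id per (b, n) pair; equal ids are adjacent because index is sorted.
    group = b * N + n
    _, counts = torch.unique_consecutive(group, return_counts=True)

    # Position of each row inside its group: global position minus group start.
    starts = torch.cumsum(counts, dim=0) - counts
    positions = torch.arange(index.shape[0], device=index.device)
    slot = positions - torch.repeat_interleave(starts, counts)

    # Advanced indexing: a single scatter of K rows, no Python loop.
    out = X.new_zeros(B, N, M, hidden)
    out[b, n, slot] = X[b, m]
    return out


# The example from the question.
X = torch.arange(24).reshape(2, 3, 4)
index = torch.tensor([[0, 0, 0], [0, 0, 1], [0, 1, 0], [0, 2, 0], [0, 2, 1],
                      [1, 0, 0], [1, 1, 1], [1, 2, 1], [1, 2, 2]])
H_out = gather_sub_tokens(X, index, M=2)
print(H_out)
```

If the index list is not sorted, sorting it by `b * N + n` first should restore the assumption. All tensors stay on whatever device `X` and `index` live on, so the same code should run on GPU once both are moved there.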