The target tensor is X (B, H, W, C). The index tensor is I (B, N, 2). And the source tensor is S (B, N, C).
So how can I implement X[i, I[i,j,0], I[i,j,1]] = S[i, j] efficiently and in parallel?
I tried to use a for loop to assign X, but it took too much time.
This code might work:

import torch

B, H, W, C = 2, 3, 3, 4
N = 5
x = torch.randn(B, H, W, C)
I = torch.randint(0, H, (B, N, 2))  # H == W here, so one bound covers both indices
S = torch.ones(B, N, C)

x[torch.arange(B).unsqueeze(1),
  I[torch.arange(B), :, 0],
  I[torch.arange(B), :, 1]] = S
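For reference, a self-contained cross-check of this vectorized assignment against the manual for loop might look like the following sketch (`x_vec` and `x_loop` are illustrative names; since S is all ones here, duplicate indices cannot cause a mismatch):

```python
import torch

B, H, W, C = 2, 3, 3, 4
N = 5
x = torch.randn(B, H, W, C)
I = torch.randint(0, H, (B, N, 2))
S = torch.ones(B, N, C)

# Vectorized assignment: broadcast the batch index against the (B, N) row/col indices
x_vec = x.clone()
x_vec[torch.arange(B).unsqueeze(1), I[:, :, 0], I[:, :, 1]] = S

# Manual for-loop assignment for comparison
x_loop = x.clone()
for i in range(B):
    for j in range(N):
        x_loop[i, I[i, j, 0], I[i, j, 1]] = S[i, j]

print(torch.equal(x_vec, x_loop))  # True (S is constant, so duplicate writes agree)
```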
Could you please double check it using your manual approach with the for loop?
PS: Please don’t create double posts, as other users might answer the same question
I have tried to change the index matrix I to a new matrix with size (B*N, 3) including the batch index. And this code works:

x[I[:, 0], I[:, 1], I[:, 2]] = S
However, your implementation is faster than mine.
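For completeness, the flattened (B*N, 3) approach described above could be constructed like this (a sketch; `batch_idx`, `I_flat`, and `S_flat` are illustrative names not taken from the thread):

```python
import torch

B, H, W, C = 2, 3, 3, 4
N = 5
x = torch.randn(B, H, W, C)
I = torch.randint(0, H, (B, N, 2))
S = torch.ones(B, N, C)

# Prepend the batch index to each (row, col) pair -> shape (B*N, 3)
batch_idx = torch.arange(B).repeat_interleave(N).unsqueeze(1)  # (B*N, 1)
I_flat = torch.cat([batch_idx, I.reshape(B * N, 2)], dim=1)    # (B*N, 3)
S_flat = S.reshape(B * N, C)

x[I_flat[:, 0], I_flat[:, 1], I_flat[:, 2]] = S_flat
```

This reads naturally but materializes the extra batch column, which is one reason the broadcasting version tends to be faster.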
And I have another question. If some indices in I are repeated, which value in S would replace the value in x? Is it the latter one in S?
PS: This is the first time I have posted a problem on the forum. I'm sorry for the mistake. Could you please delete the other question? Thanks.
On the CPU it might be the last value. I think this behavior is undefined on the GPU.
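If deterministic handling of duplicate indices matters, one option is Tensor.index_put_ with accumulate=True, which sums the duplicate contributions instead of depending on write order (a minimal sketch; the shapes and values are illustrative):

```python
import torch

B, H, W, C = 1, 3, 3, 1
x = torch.zeros(B, H, W, C)

# Two source rows target the same (batch, row, col) location
idx_b = torch.tensor([0, 0])
idx_h = torch.tensor([1, 1])
idx_w = torch.tensor([2, 2])
src = torch.tensor([[1.0], [2.0]])

# With accumulate=True the duplicate writes are summed deterministically
x.index_put_((idx_b, idx_h, idx_w), src, accumulate=True)
print(x[0, 1, 2, 0])  # tensor(3.)
```

With accumulate=False (the default, equivalent to plain indexed assignment), only one of the two values would survive, and which one is not guaranteed on the GPU.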
Sure, we’ll delete the other post.
How can I implement a similar task if the tensors are extended by one more dimension?
X (B, S, H, W, C)
I (B, S, N, 2)
S (B, S, N, C)
The objective is X[i, j, I[i, j, k, 0], I[i, j, k, 1]] = S[i, j, k].
I think this method could work:

x[torch.arange(B).unsqueeze(1).unsqueeze(2),
  torch.arange(S).unsqueeze(1),
  I[torch.arange(B).unsqueeze(1), torch.arange(S), :, 0],
  I[torch.arange(B).unsqueeze(1), torch.arange(S), :, 1]] = S
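One caveat: in the snippet above the name S is used both for a dimension size and for the source tensor, so it cannot run as written. A disambiguated sketch with a for-loop cross-check (renaming the dimension to `S_dim` and the source tensor to `src`, both illustrative names) could look like:

```python
import torch

B, S_dim, H, W, C = 2, 3, 4, 4, 5   # S_dim is the extra dimension (called S above)
N = 6
x = torch.randn(B, S_dim, H, W, C)
I = torch.randint(0, H, (B, S_dim, N, 2))
src = torch.ones(B, S_dim, N, C)    # the source tensor (called S above)

# Vectorized: broadcast (B,1,1), (1,S_dim,1), and (B,S_dim,N) index tensors
x_vec = x.clone()
x_vec[torch.arange(B)[:, None, None],
      torch.arange(S_dim)[None, :, None],
      I[..., 0], I[..., 1]] = src

# Manual for-loop assignment for comparison
x_loop = x.clone()
for i in range(B):
    for j in range(S_dim):
        for k in range(N):
            x_loop[i, j, I[i, j, k, 0], I[i, j, k, 1]] = src[i, j, k]

print(torch.equal(x_vec, x_loop))  # True (src is constant, so duplicates agree)
```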