Hi there,

We have a tensor `A` of size `[batch_size, beam_size, model_dim * n_steps]` and a tensor `positions` of size `[batch_size, beam_size]`, and we want to build a tensor `B` such that:

```
B[i][j] = A[i][positions[i][j]]
```

This is similar to the `index_copy_` function, except that it would handle several dimensions instead of only a vector of indices.
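For concreteness, here is a naive loop version of the operation we want to replace (a minimal sketch with made-up small sizes; the actual shapes come from our model):

```python
import torch

# Hypothetical small sizes, just for illustration.
batch_size, beam_size, model_dim, n_steps = 2, 3, 4, 2

A = torch.randn(batch_size, beam_size, model_dim * n_steps)
positions = torch.randint(0, beam_size, (batch_size, beam_size))

# Naive per-element construction of B[i][j] = A[i][positions[i][j]].
B = torch.empty_like(A)
for i in range(batch_size):
    for j in range(beam_size):
        B[i, j] = A[i, positions[i, j]]
```
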

Is there a way to express this as a single tensor operation instead?

To give a bit of context, we’re trying to optimize this code, which is called within a `for` loop. Would an operation like the one described above be faster?
Thanks in advance!
François