Index_copy_ on several dimensions

Hi there,

We have a tensor A of size [ batch_size, beam_size, model_dim * n_steps ] and a tensor positions of size [ batch_size, beam_size ], and we want to build a tensor B like this:
B[i][j] = A[i][positions[i][j]].

This is similar to the index_copy_ function, except that it would handle several dimensions of indices instead of only a 1-D vector.
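To make the operation concrete, here is a minimal sketch of the semantics described above, together with one way it could be expressed as a single tensor operation via advanced indexing (assuming positions is a LongTensor of beam indices in [0, beam_size); the sizes below are placeholders):

```python
import torch

batch_size, beam_size, model_dim, n_steps = 2, 3, 4, 5
A = torch.randn(batch_size, beam_size, model_dim * n_steps)
positions = torch.randint(0, beam_size, (batch_size, beam_size))

# Loop version of B[i][j] = A[i][positions[i][j]]
B_loop = torch.empty_like(A)
for i in range(batch_size):
    for j in range(beam_size):
        B_loop[i, j] = A[i, positions[i, j]]

# Equivalent single tensor operation via advanced indexing:
# the [batch_size, 1] batch index broadcasts against the
# [batch_size, beam_size] positions tensor.
B = A[torch.arange(batch_size).unsqueeze(1), positions]

assert torch.equal(B, B_loop)
```

torch.gather along dim=1 (with positions expanded to A's last dimension) would be another candidate, but whether either form is actually faster than the loop is exactly the question being asked here.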

  1. Is there a way to optimize this to run it as a tensor operation?

  2. To give a bit of context, we’re trying to optimize this code, which is called within a for loop. Would the operation described above be faster?

Thanks in advance!