Given an input tensor and a second tensor, selection, I want to produce the output tensor shown below

eduardo4jesus · June 4, 2021, 2:34pm

OK, now I understand what you want. I can’t see an elegant way to solve this using plain PyTorch. The “naive” way to do it would be with a for loop.

First, let me redefine what I think you are asking:

Given an input tensor and a second tensor, selection, I want to produce the output tensor shown below.

import torch

input = torch.tensor([2., 9., 7., 8., 4.])
input
# tensor([2., 9., 7., 8., 4.])

# Row indices are stored as integers so they can be used for indexing.
selection = torch.tensor([0, 0, 1, 2, 3])
selection
# tensor([0, 0, 1, 2, 3])

desired = torch.Tensor([[2, 9], [7, 0], [8, 0], [4, 0]])
desired
# tensor([[2., 9.],
#         [7., 0.],
#         [8., 0.],
#         [4., 0.]])

Explanation

selection has the same size as input and gives, for each input element, the row it will be mapped into. If a row index repeats, the next input element with that index is placed into a new column of that row. Rows with fewer elements are padded with zeros.
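To make the rule concrete, here is a small plain-Python sketch that only computes where each element lands. The names `seen` and `positions` are made up for illustration; nothing here is PyTorch-specific.

```python
selection = [0, 0, 1, 2, 3]

seen = {}       # how many elements each row has received so far
positions = []  # (row, col) target of each input element, in input order

for row in selection:
    col = seen.get(row, 0)  # first free column in this row
    positions.append((row, col))
    seen[row] = col + 1

print(positions)
# [(0, 0), (0, 1), (1, 0), (2, 0), (3, 0)]
```

The second element repeats row 0, so it moves to column 1; every other row only fills column 0 and gets a zero in column 1.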

The following assumption was not stated initially, but if it holds, the code is more straightforward. The provided example also allows us to infer it.

Assumption: All row indices are represented in the selection tensor.
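If you want to verify that assumption before relying on it, a check along these lines might do. `covers_all_rows` is a hypothetical helper name, and the sketch assumes selection is non-empty and holds non-negative indices.

```python
import torch

def covers_all_rows(selection):
    """Return True if every index in 0..max(selection) appears at least once."""
    unique = selection.long().unique()  # sorted unique row indices
    expected = torch.arange(int(unique.max()) + 1, device=unique.device)
    return torch.equal(unique, expected)

print(covers_all_rows(torch.tensor([0, 0, 1, 2, 3])))  # True
print(covers_all_rows(torch.tensor([0, 2])))           # False, row 1 is missing
```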

As I said earlier, I can’t think of a way to solve this elegantly with a few short nested PyTorch calls. From what I know, I don’t think it is possible. But it can definitely be done with a for loop.

Solution using a for loop:

# Output shape: one row per index, one column per repeat of the most frequent index.
unique, counts = selection.unique(return_counts=True)
n_rows, n_cols = int(unique.max()) + 1, int(counts.max())

col = [0] * n_rows  # next free column in each row
output = torch.zeros((n_rows, n_cols))

# .long() guarantees integer row indices even if selection holds floats.
for data, row in zip(input, selection.long().tolist()):
    output[row, col[row]] = data
    col[row] += 1

output
# tensor([[2., 9.],
#         [7., 0.],
#         [8., 0.],
#         [4., 0.]])

PS: I guess the operation you are looking for can also be described as a sort of group-by.
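Reading it as a group-by does suggest that a loop-free variant might be possible: sort the elements by row with a stable sort, then take each element's offset inside its group as the column. The sketch below is only one way this could look. `group_rows` is a made-up name, it assumes non-negative integer-like indices in selection, and it relies on the `stable=True` option of `torch.sort`, which is not available in older PyTorch releases.

```python
import torch

def group_rows(values, selection):
    """Scatter values into rows given by selection, padding short rows with zeros."""
    selection = selection.long()
    # A stable sort keeps the original order of elements that share a row.
    rows, order = torch.sort(selection, stable=True)
    counts = torch.bincount(rows)                   # number of elements per row
    starts = torch.cumsum(counts, dim=0) - counts   # first sorted position of each row
    # Column index = position in sorted order minus the start of that row's group.
    cols = torch.arange(rows.numel(), device=rows.device) - starts[rows]
    out = torch.zeros(counts.numel(), int(counts.max()),
                      dtype=values.dtype, device=values.device)
    out[rows, cols] = values[order]
    return out

result = group_rows(torch.tensor([2., 9., 7., 8., 4.]),
                    torch.tensor([0, 0, 1, 2, 3]))
print(result)
# tensor([[2., 9.],
#         [7., 0.],
#         [8., 0.],
#         [4., 0.]])
```

Because `torch.bincount` also reports a count of zero for missing indices, a row that never appears in selection simply stays all zeros here.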