Hi there,
I would like to compute an operation similar to scatter_add in TensorFlow. Below is a concrete example of what I want to achieve. Given the following two tensors, each of shape (batch_size, seq_length):
ids = [
[1, 0, 0, 0],
[4, 5, 0, 0],
[10, 0, 0, 0]
]
scores = [
[10.0, 0.0, 0.0, 0.0],
[4.977129936218262, 5.0228705406188965, 0.0, 0.0],
[10.0, 0.0, 0.0, 0.0]
]
I want to use ids to place the values in each row of scores at the corresponding positions of a larger tensor of shape (batch_size, total_size). Each element of ids ranges from 0 to total_size-1. Assuming that total_size is 10 and batch_size is 3, the result would be:
transformed_scores = [
[0.0, 10.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
[0.0, 0.0, 0.0, 0.0, 4.977129936218262, 5.0228705406188965, 0.0, 0.0, 0.0, 0.0],
[0.0, 10.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
]
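To make the intended semantics concrete, here is a minimal plain-Python sketch of the forward computation. The function name scatter_add_rows is made up for illustration, and the snippet only uses the first two rows of the example above. It works on plain lists, so it is not differentiable; it only shows the result I am after:

```python
def scatter_add_rows(ids, scores, total_size):
    """Reference semantics only: plain Python lists, no autograd."""
    # One output row of zeros per batch element.
    out = [[0.0] * total_size for _ in ids]
    for b, (row_ids, row_scores) in enumerate(zip(ids, scores)):
        for i, s in zip(row_ids, row_scores):
            # Repeated ids within a row accumulate, as in a scatter-add.
            out[b][i] += s
    return out


# First two rows of the example (hypothetical reduced input).
ids = [
    [1, 0, 0, 0],
    [4, 5, 0, 0],
]
scores = [
    [10.0, 0.0, 0.0, 0.0],
    [4.977129936218262, 5.0228705406188965, 0.0, 0.0],
]

result = scatter_add_rows(ids, scores, total_size=10)
for row in result:
    print(row)
```

The padded positions have id 0 and score 0.0, so they add nothing to column 0.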
I would like this operation to be differentiable so that I can backpropagate gradients to the intermediate representations.
Thanks!