Hi all,
I'm having trouble aggregating values that belong to repeated keys, and would appreciate any ideas on how to solve it.
So, I have 2 tensors:
tgt - tokens (just numbers from a vocab)
probs - the prob for each corresponding token
I want to know the probability for each unique token.
But each token can appear more than once in tgt, so a token may have several probs that need to be summed, and this has to happen for every token.
The real tensors have a batch dimension:
tgt.shape (Batch_size X k)
probs.shape (Batch_size X k)
For example (dropping the batch dimension for simplicity, so k = 7):
tgt = [3, 7, 3, 5, 11, 11, 11]
probs = [-0.2, -0.4, -0.5, -0.8, -0.1, -0.7, -0.2]
# We want the summed probs:
summed_probs = [-0.7, -0.4, -0.7, -0.8, -1.0, -1.0, -1.0]
# OR
summed_probs = [(3, -0.7),(7, -0.4), (5, -0.8), (11, -1.0)]
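To make the desired result concrete, here is a minimal plain-Python sketch of the aggregation for a single row. It is only a reference for the expected output, not my actual Numba code, and the function name summed_probs_reference is made up for illustration:

```python
from collections import defaultdict


def summed_probs_reference(tgt, probs):
    """Loop-based reference for one row (no batch dimension)."""
    # Accumulate the total for each unique token.
    totals = defaultdict(float)
    for token, p in zip(tgt, probs):
        totals[token] += p

    # Variant 1: one summed value per position, aligned with tgt.
    per_position = [totals[token] for token in tgt]
    # Variant 2: one (token, sum) pair per unique token,
    # in order of first appearance (dicts keep insertion order).
    per_token = list(totals.items())
    return per_position, per_token


tgt = [3, 7, 3, 5, 11, 11, 11]
probs = [-0.2, -0.4, -0.5, -0.8, -0.1, -0.7, -0.2]
per_position, per_token = summed_probs_reference(tgt, probs)
```

Either output variant would work for me, as long as it can be computed per row across the whole batch.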
Right now I'm doing it with iterative code using Numba, but I would like a solution that uses only tensor operations.
Thanks!