Adding the tensor values based on indices

I have a tensor of values and a corresponding tensor of indices (token ids from a tokenizer). I want to sum all the values whose index is not a special token id.

For instance:


values=tensor([1.0000, 0.1574, 0.1507, 0.2520, 0.2456, 0.2365, 0.2330, 0.2294, 0.2321,
         0.2339], grad_fn=<MaxBackward0>),
indices=tensor([32099,    12,    24,     3,     3,     3,     3,     3,     3,     3])

I want sum(0.1574, 0.1507)
What is the optimal way to achieve this?

You could most likely create a mask from the non-special indices, apply it to the values tensor, and sum the result.
I’m not sure how the “special” indices are defined, and thus how complicated it would be to “find” them.
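
A minimal sketch of the mask approach, assuming the special ids are known in advance. The `special_ids` tensor here is a placeholder (not something defined in the question), and `torch.isin` requires a reasonably recent PyTorch release:

```python
import torch

# Example data from the question (grad_fn omitted).
values = torch.tensor([1.0000, 0.1574, 0.1507, 0.2520, 0.2456,
                       0.2365, 0.2330, 0.2294, 0.2321, 0.2339])
indices = torch.tensor([32099, 12, 24, 3, 3, 3, 3, 3, 3, 3])

# Assumption: these are the ids to exclude; substitute your tokenizer's special ids.
special_ids = torch.tensor([3, 32099])

# True at positions whose id is NOT one of the special ids.
mask = ~torch.isin(indices, special_ids)

# Boolean indexing keeps only the selected values, which are then summed.
total = values[mask].sum()
print(total)
```

Because the ids to exclude live in a tensor, the same code should work for any number of special ids without chaining comparisons.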

Based on your description and my assumptions, this can be solved in two steps:

  1. mask the indices
  2. select the values

Here we assume that ids 3 and 32099 are the special token ids.
>>> values
tensor([1.0000, 0.1574, 0.1507, 0.2520, 0.2456, 0.2365, 0.2330, 0.2294, 0.2321,
        0.2339])
>>> indices
tensor([32099,    12,    24,     3,     3,     3,     3,     3,     3,     3])
>>> mask = (indices != 3) & (indices != 32099)  # keep positions whose id is neither 3 nor 32099
>>> mask
tensor([0, 1, 1, 0, 0, 0, 0, 0, 0, 0], dtype=torch.uint8)
>>> torch.masked_select(values, mask)
tensor([0.1574, 0.1507])
>>> torch.sum(torch.masked_select(values, mask))
tensor(0.3081)
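
If the tensors are batched (one row per sequence), `torch.masked_select` returns a flattened 1-D result, so the per-row sums are lost. One possible variant, sketched under that assumption, multiplies by the mask and sums along the last dimension instead. The helper name `masked_sum` and the second row of data are made up for illustration:

```python
import torch

def masked_sum(values, indices, special_ids):
    """Sum values along the last dim, skipping positions whose id is in special_ids."""
    special = torch.as_tensor(special_ids, device=indices.device)
    keep = ~torch.isin(indices, special)
    # Zero out the special positions, then reduce each row separately.
    return (values * keep.to(values.dtype)).sum(dim=-1)

# First row is taken from the question (truncated); second row is hypothetical.
values = torch.tensor([[1.0, 0.1574, 0.1507, 0.2520],
                       [1.0, 0.5, 0.25, 0.125]], requires_grad=True)
indices = torch.tensor([[32099, 12, 24, 3],
                        [32099, 3, 7, 9]])

per_row = masked_sum(values, indices, [3, 32099])
print(per_row)

# Gradients flow only to the positions that were kept.
per_row.sum().backward()
print(values.grad)
```

The multiplication keeps the operation differentiable with respect to `values`, which matters here since the original tensor carries a `grad_fn`.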