Replace duplicate values of a tensor based on scores from another

IjlalBaig · August 15, 2019, 8:43am

Hi,

Consider the following example:

idxs = [[0, 1, 2, 1], [3, 2, 1, 3]]
scores = [[0.5000, 0.8000, 0.5000, 0.5000], [0.5000, 0.5000, 0.5000, 0.9000]]

Within a batch of idxs, I would like to only retain values with the largest corresponding score entry and replace the rest with -1.

So I would expect an output:

out = [[ 0, 1, 2, -1], [ -1, 2, 1, 3]]

How can I achieve something like this?

Thanks

albanD · August 15, 2019, 10:01am

Hi,

Your output does not match your description I think.
Don’t you expect: [[ -1, 1, -1, -1], [-1, -1, -1, 3]] ?

IjlalBaig · August 15, 2019, 10:17am

Hi albanD,

For the first batch [0, 1, 2, 1], only the 1s are duplicated, so the 1 with the larger score remains and the other 1 is replaced by -1. Similarly, in the second batch [3, 2, 1, 3], the 3s are duplicated, so only the 3 with the smaller score is replaced.
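
To make the rule concrete, here is a minimal plain-Python sketch for a single batch. The function name keep_best_duplicates and the fill parameter are hypothetical, and keeping the first occurrence when duplicate scores tie is an assumption, since the example above contains no such tie.

```python
def keep_best_duplicates(idxs, scores, fill=-1):
    """Keep the best-scoring occurrence of each value; replace the rest with fill."""
    best = {}  # value -> position of its best-scoring occurrence so far
    for pos, (value, score) in enumerate(zip(idxs, scores)):
        # Strict '>' keeps the first occurrence on ties (an assumption).
        if value not in best or score > scores[best[value]]:
            best[value] = pos
    return [v if best[v] == pos else fill for pos, v in enumerate(idxs)]


print(keep_best_duplicates([0, 1, 2, 1], [0.5, 0.8, 0.5, 0.5]))  # [0, 1, 2, -1]
print(keep_best_duplicates([3, 2, 1, 3], [0.5, 0.5, 0.5, 0.9]))  # [-1, 2, 1, 3]
```

Values that appear only once are always their own best occurrence, so they pass through unchanged.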

albanD · August 15, 2019, 10:25am

Oh, I missed the part about duplicates in your explanation.
I’m afraid you won’t have a specialized function to do this, so you will have to do it by hand, checking each index.

IjlalBaig · August 15, 2019, 10:34am

Thanks for the reply.
I do have an idea of how I want to implement this, but it is limited to just one batch. If I flatten all the batches and then use that function, the search space for comparison will be too large, since each duplicate value will be compared with all the other values in the flattened tensor. Is there a way I can apply a function to each batch individually?

albanD · August 15, 2019, 10:37am

You can do an outer for loop and just look at input = full_input.select(0, batch_idx) every time.
Note that select() does not copy memory, so any in-place change to input will be reflected in full_input!
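
A rough sketch of that loop, using the tensors from the question. The helper name mask_duplicates_ is hypothetical, and it is only one possible per-batch routine, not necessarily the one the original poster had in mind. Which occurrence argmax picks when duplicate scores tie is not something this sketch controls.

```python
import torch


def mask_duplicates_(row, row_scores, fill=-1):
    """In place: keep the best-scoring occurrence of each value in a 1-D row."""
    # unique() returns a new tensor, so writing to row below is safe.
    for value in row.unique():
        positions = (row == value).nonzero(as_tuple=True)[0]
        if positions.numel() > 1:  # only duplicated values need handling
            best = positions[row_scores[positions].argmax()]
            losers = positions[positions != best]
            row[losers] = fill  # writes through the view


full_input = torch.tensor([[0, 1, 2, 1], [3, 2, 1, 3]])
scores = torch.tensor([[0.5, 0.8, 0.5, 0.5], [0.5, 0.5, 0.5, 0.9]])

for batch_idx in range(full_input.size(0)):
    row = full_input.select(0, batch_idx)  # a view, not a copy
    mask_duplicates_(row, scores.select(0, batch_idx))

print(full_input.tolist())  # [[0, 1, 2, -1], [-1, 2, 1, 3]]
```

Because each row is a view, full_input itself is modified. Call full_input.clone() first if the original values are still needed.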

IjlalBaig · August 15, 2019, 10:41am

I was hoping to avoid using a for loop, but I guess it is inevitable. Thanks for all your help.