Best way to run unique_consecutive() on a certain dimension?

galactica147 · February 23, 2021, 12:52am

What’s the best way to call unique_consecutive() along a certain dimension of a tensor and pad the leftover positions with a specified value?
For simplicity, we can use a 2D tensor as an example:

input = tensor([[3, 3, 5, 5, 5],
                [3, 3, 2, 2, 3]])

With a padding value of -1, what I hope to get is:

output = tensor([[3, 5, -1, -1, -1],
                 [3, 2, 3, -1, -1]])

Also, to save memory, I’d prefer to do this in place if possible.
Thanks!

Eta_C · February 23, 2021, 2:13am

Try

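import torch

x = torch.tensor([[3, 3, 5, 5, 5],
                  [3, 3, 2, 2, 3]])
# Without dim, x is treated as flattened; indices has the same shape as x
unique_x, indices = torch.unique_consecutive(x, return_inverse=True)
# Make the indices restart at 0 in every row so they work as column positions
indices -= indices.min(dim=1, keepdims=True)[0]
result = torch.full_like(x, -1)  # output pre-filled with the padding value
result.scatter_(1, indices, x)   # write each value to its new column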

galactica147 · February 23, 2021, 7:52am

Thanks! This is working well!

Could you elaborate a bit on what indices -= indices.min(dim=1, keepdims=True)[0] is doing?

This solution creates another tensor for the result, which is fine. Still, I wonder if there is an in-place way that would save the extra memory.

Eta_C · February 23, 2021, 9:09am

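Without a dim argument, unique_consecutive treats the input as flattened, so the inverse indices keep counting across rows: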

indices = [[0, 0, 1, 1, 1],
           [2, 2, 3, 3, 4]]

But scatter along dim 1 needs column indices, so each row's minimum is subtracted to make every row start at 0:

indices = [[0, 0, 1, 1, 1],
           [0, 0, 1, 1, 2]]
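
To see those two index tensors for yourself, something like the following should print them. It is only a small sketch: raw_indices, row_start and col_indices are names made up here for illustration, and keepdim is the spelling used in the torch.min documentation.

```python
import torch

x = torch.tensor([[3, 3, 5, 5, 5],
                  [3, 3, 2, 2, 3]])

# With no dim argument the input is treated as flattened, so the inverse
# indices keep counting across the row boundary.
_, raw_indices = torch.unique_consecutive(x, return_inverse=True)

# Smallest index in each row, kept as a column so it broadcasts along dim 1.
row_start = raw_indices.min(dim=1, keepdim=True)[0]

# Shift every row so its numbering restarts at 0: these are column positions.
col_indices = raw_indices - row_start

print(raw_indices.tolist())  # [[0, 0, 1, 1, 1], [2, 2, 3, 3, 4]]
print(row_start.tolist())    # [[0], [2]]
print(col_indices.tolist())  # [[0, 0, 1, 1, 1], [0, 0, 1, 1, 2]]
```

Several positions in a row share the same column index, but they always carry the same value, so it does not matter which of them scatter ends up writing.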

As for doing it in place, I have no idea right now.
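
The closest thing that comes to mind is reusing x as the output buffer, offered as a rough sketch rather than a true in-place algorithm. The helper name dedup_consecutive_ and its pad_value parameter are made up for illustration, and it assumes a 2D tensor. It still allocates the index tensor and a temporary copy of x (scatter_ should not read from the tensor it is writing to), so peak memory does not go down; the only saving is that no second result tensor is kept afterwards.

```python
import torch

def dedup_consecutive_(x, pad_value=-1):
    """Hypothetical helper: overwrite the 2D tensor x with its row-wise
    consecutive-unique values, left-aligned and padded with pad_value."""
    # Inverse indices over the flattened tensor, same shape as x.
    _, indices = torch.unique_consecutive(x, return_inverse=True)
    # Restart the numbering at 0 in every row to get column positions.
    indices -= indices.min(dim=1, keepdim=True)[0]
    # Temporary copy of the values, since x itself is about to be overwritten.
    src = x.clone()
    x.fill_(pad_value)
    x.scatter_(1, indices, src)
    return x

x = torch.tensor([[3, 3, 5, 5, 5],
                  [3, 3, 2, 2, 3]])
dedup_consecutive_(x, pad_value=-1)
print(x.tolist())  # [[3, 5, -1, -1, -1], [3, 2, 3, -1, -1]]
```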

galactica147 · February 23, 2021, 5:37pm

This is helpful! Thanks a lot!