Some questions about torch.unique()

bulubulubu-Yu · July 31, 2019, 3:52am

Hello, I have some questions about torch.unique().

For a CPU tensor:

input = torch.tensor([[2,5,7,6],[9,7,4,8],[1,3,2,3],[2,5,7,6]])

input
tensor([[2, 5, 7, 6],
        [9, 7, 4, 8],
        [1, 3, 2, 3],
        [2, 5, 7, 6]])

torch.unique(input)
tensor([3, 1, 8, 4, 9, 6, 7, 5, 2])

torch.unique(input, sorted=True)
tensor([1, 2, 3, 4, 5, 6, 7, 8, 9])

torch.unique(input, dim=0)
tensor([[1, 3, 2, 3],
        [2, 5, 7, 6],
        [9, 7, 4, 8]])

torch.unique(input, sorted=True, dim=0)
tensor([[1, 3, 2, 3],
        [2, 5, 7, 6],
        [9, 7, 4, 8]])

torch.unique(input, sorted=False, dim=0)
tensor([[1, 3, 2, 3],
        [2, 5, 7, 6],
        [9, 7, 4, 8]])

I noticed that the official documentation (https://pytorch.org/docs/stable/_modules/torch/functional.html#unique) gives sorted=True as the default for torch.unique. However, torch.unique(input) returns an unsorted result, while torch.unique(input, sorted=False, dim=0) still returns a sorted one.

I also tested a GPU tensor and found that the output is always sorted, regardless of whether sorted is set to True or False.

Why is that? Is something wrong here?

ptrblck · July 31, 2019, 10:45am

Depending on the underlying implementation, the output may be sorted even when sorted=False, e.g. for performance reasons.
For CUDA tensors the output is always sorted, due to limitations in thrust.
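
Since the ordering can depend on the code path, code that needs a particular order probably should not rely on the default. Here is a minimal sketch using the tensor from the first post (renamed to x here to avoid shadowing the Python builtin input): pass sorted=True explicitly, or sort the result afterwards with torch.sort.

```python
import torch

# Same values as the tensor in the question.
x = torch.tensor([[2, 5, 7, 6],
                  [9, 7, 4, 8],
                  [1, 3, 2, 3],
                  [2, 5, 7, 6]])

# Request sorted output explicitly instead of relying on the default.
flat_unique = torch.unique(x, sorted=True)

# Alternative: sort the result afterwards, so the final order does not
# depend on what the backend does when sorted=False.
flat_unique_sorted, _ = torch.sort(torch.unique(x, sorted=False))

# Unique rows, explicitly sorted along dim=0.
row_unique = torch.unique(x, sorted=True, dim=0)

print(flat_unique)
print(flat_unique_sorted)
print(row_unique)
```

Note that sorted=False only means the output order is not guaranteed. It does not mean the original order of the elements is preserved.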

bulubulubu-Yu · August 1, 2019, 3:00am

So the only special case is a CPU tensor with dim=None, where the output is unsorted by default. Right?

ptrblck · August 1, 2019, 10:08am

If I haven’t missed a code path, I think you are right.
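
To check this on a particular install, a rough probe like the one below may help. It only covers the flattened case (dim=None). The helper is_sorted_1d and the report dict are names made up for this sketch, and the result for sorted=False may differ between PyTorch versions and devices.

```python
import torch

def is_sorted_1d(t):
    """Return True if a 1-D tensor is in non-decreasing order."""
    return bool((t[1:] >= t[:-1]).all())

x = torch.tensor([[2, 5, 7, 6],
                  [9, 7, 4, 8],
                  [1, 3, 2, 3],
                  [2, 5, 7, 6]])

# Probe the CPU, plus CUDA if a GPU is available.
devices = ["cpu"] + (["cuda"] if torch.cuda.is_available() else [])

report = {}
for device in devices:
    for flag in (True, False):
        out = torch.unique(x.to(device), sorted=flag)
        # Record whether this (device, sorted flag) pair produced sorted output.
        report[(device, flag)] = is_sorted_1d(out)

for (device, flag), ok in report.items():
    print(f"device={device} sorted={flag}: output sorted? {ok}")
```

Only the sorted=True rows are something to rely on. A sorted result with sorted=False is an implementation detail and not a guarantee.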

bulubulubu-Yu · August 1, 2019, 11:08am

OK, gotcha. Thanks for your reply!