Is there any unique function which keeps the order of occurrence?

Hello,

I have two questions regarding the unique function:

1- The example in the PyTorch documentation for the unique function suggests that, when sorted is not provided, it defaults to False:

output = torch.unique(torch.tensor([1, 3, 2, 3], dtype=torch.long))
output
tensor([ 2, 3, 1])

However, on my computer I get:
torch.unique(torch.tensor([1, 3, 2, 3], dtype=torch.long))
tensor([1, 2, 3])

which means that the default value is sorted=True. Where does this inconsistency come from?

2- Moreover, I found that the output of

torch.unique(torch.tensor([1, 3, 2, 3], dtype=torch.long), sorted=False)
is
tensor([2, 3, 1])

which means that it doesn't even keep the order of occurrence. How can I get a unique function that keeps elements in the order of their first occurrence? With the above example, the output should be:
tensor([1, 3, 2])

Hi,

I have tested your cases and it seems you are right.
The docs say that if dim is specified, the output is sorted regardless of whether sorted=True or sorted=False, even though the default value is None. I think that is why the tensor still gets sorted even though the default value of sorted appears to be False.
Because of this forced sorting, I think the original order of the values is lost.

This is another related post around this issue:

Also, in the referenced post, you can find an approach to get the correct order, but it is not built-in.
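
For illustration, here is a minimal sketch of one way to keep the order of first occurrence. It is not necessarily the approach from the referenced post. The helper name unique_keep_order is made up for this example and is not part of PyTorch, and the sketch assumes a reasonably recent PyTorch in which Tensor.scatter_reduce_ is available (I believe 1.12 or later). The idea is to let torch.unique sort as it likes, then reorder the unique values by the position where each one first appears in the input.

```python
import torch


def unique_keep_order(x):
    """Hypothetical helper (not a PyTorch function): unique values of x,
    ordered by first occurrence in the flattened input."""
    x = x.flatten()
    # uniq is sorted; inverse[i] is the index in uniq of the value x[i].
    uniq, inverse = torch.unique(x, sorted=True, return_inverse=True)
    positions = torch.arange(x.numel(), device=x.device)
    # Start every slot at x.numel() (larger than any valid position), then
    # take the minimum position seen for each unique value.
    first = torch.full((uniq.numel(),), x.numel(),
                       dtype=torch.long, device=x.device)
    first.scatter_reduce_(0, inverse, positions, reduce="amin")
    # Reorder the unique values by where they first appeared.
    return uniq[first.argsort()]


x = torch.tensor([1, 3, 2, 3], dtype=torch.long)
print(unique_keep_order(x))  # tensor([1, 3, 2])
```

Since this does not depend on the sorted argument behaving in any particular way, it should give the same result whether or not torch.unique sorts by default.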

Bests

Hello, thank you for the response.

The docs say that if dim is specified, the output is sorted regardless of whether sorted=True or sorted=False, even though the default value is None. I think that is why the tensor still gets sorted even though the default value of sorted appears to be False.

But I haven't specified dim, the default value of dim is None, and I am working on the CPU. So I don't understand this behavior.

I think that is the bug. The other post I referenced states that unique behaves differently from what it is supposed to do: the GPU implementation literally ignores the sorted argument.
In CPU mode, the order is not correct.

Also, in the source code, there are two implementations of unique for both CPU and GPU, which behave differently w.r.t. the value of ndim. This has not been fixed yet, and the issue on the GitHub page is still open.