Non-deterministic Sorting

Hi!

I have the following piece of code:

sorted_values, indices = torch.sort(score, dim=0, descending=True)

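# Mask out the filtered entries, then sort again.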
score[filt_idxs] = float("-inf")
filt_values, filt_indices = torch.sort(score, dim=0, descending=True)

I have manually examined the score tensor, and after filtering it contains only a single -inf value. That would mean filt_indices should shift up by at most one position compared to indices.
However, I see that in cases where values of the score tensor are equal, indices and filt_indices are completely different.

This behavior makes me think that sorting in PyTorch is non-deterministic. Is that so? If so, how can I make it deterministic?

Extra details:

  • All operations run on the CPU
  • PyTorch 1.1.0

Hi,

Could you share a small score Tensor that reproduces this behavior, so we can look into the issue?

Hi,
Sure!

You can just try it with score = torch.ones(10) (torch.sort expects a Tensor, not a NumPy array).
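
For completeness, a runnable version of the repro might look like this (the filtered index is chosen arbitrarily):

import torch

score = torch.ones(10)

# All values are tied, so every permutation is a valid sort order.
values_a, indices_a = torch.sort(score, dim=0, descending=True)

score[3] = float("-inf")  # filter out a single, arbitrarily chosen index
values_b, indices_b = torch.sort(score, dim=0, descending=True)

# indices_a and indices_b can disagree in many positions, not just one,
# because the sort need not preserve the input order of equal values.
print(indices_a)
print(indices_b)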

Also, I have realized that my choice of wording was incorrect.
It would be more accurate to say that sorting in PyTorch is not stable, which makes sense if the sorting operation internally uses quicksort.

However, it would be a really useful feature to be able to specify whether the sort should be stable, e.g. by passing a flag to the sorting function. In that case it could internally run merge sort, since merge sort is stable.
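
In the meantime, one possible workaround for CPU tensors is to fall back on NumPy, whose argsort already supports a stable algorithm via kind="stable" (the helper name stable_sort_desc below is just for illustration):

import numpy as np
import torch

def stable_sort_desc(score):
    # Negating the values turns NumPy's ascending stable sort into a
    # descending one, while ties keep their original input order.
    np_indices = np.argsort(-score.numpy(), kind="stable")
    indices = torch.from_numpy(np_indices)
    return score[indices], indices

values, indices = stable_sort_desc(torch.ones(10))
print(indices)  # with all values tied, indices keep input order 0..9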

Good proposal. I also ran into a similar problem when translating PC-side PyTorch code to mobile-side C++ code. In my scenario, counting sort is more efficient. However, the indices returned by PyTorch differ from those returned by my counting sort when there are equal items (both results are correctly sorted). It would be nice if we could control the PyTorch sort algorithm to keep the results consistent.
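
For reference, a counting sort is naturally stable as long as items are appended to their buckets in input order. A minimal sketch in plain Python (ascending order, small non-negative integer keys, hypothetical helper name):

def counting_argsort(keys, num_buckets):
    # Each bucket collects indices in input order, so equal keys keep
    # their original relative order (this is the stability property).
    buckets = [[] for _ in range(num_buckets)]
    for idx, key in enumerate(keys):
        buckets[key].append(idx)
    return [idx for bucket in buckets for idx in bucket]

print(counting_argsort([2, 0, 2, 1, 0], num_buckets=3))  # [1, 4, 3, 0, 2]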

Hi,

We would be happy to accept a PR implementing new sorting algorithms that can be chosen via a flag (with the current behavior as the default).
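
To illustrate, the user-facing API could look something like this (a hypothetical signature, not available in PyTorch 1.1):

# Hypothetical stable flag: equal values would keep their input order,
# so after masking a single entry the indices would shift by at most one.
values, indices = torch.sort(score, dim=0, descending=True, stable=True)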