Function torch.max() returns inconsistent indices between CPU and GPU

I encountered inconsistent torch.max() behaviour between CPU and GPU, which can be reproduced by:

import torch
x = torch.empty(2, 10, 10)
x[0, :, :] = 1
x[1, :, :] = 2
x[:, 3:7, 3:7] = 0
value, idx = torch.max(x, 0)
print(idx)

(0 ,.,.) =
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 0 0 0 0 1 1 1
1 1 1 0 0 0 0 1 1 1
1 1 1 0 0 0 0 1 1 1
1 1 1 0 0 0 0 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
[torch.LongTensor of size 1x10x10]

while on the GPU:

value, idx = torch.max(x.cuda(), 0)
print(idx)

(0 ,.,.) =
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
[torch.cuda.LongTensor of size 1x10x10 (GPU 0)]

Shouldn't the CPU and GPU outputs be consistent?

This is an ambiguous case: both results are correct. The CPU and GPU each return a valid set of indices, but they may not agree with each other when breaking ties.
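To see why both answers are valid, note that every index the operation returns still points at a genuine maximum. A minimal CPU-only sketch of that check, reusing the repro tensor from above:

```python
import torch

# Reproduce the ambiguous case: along dim 0, the central 4x4 block
# holds the value 0 in BOTH slices, so index 0 and index 1 are
# equally correct there.
x = torch.zeros(2, 10, 10)
x[0] = 1
x[1] = 2
x[:, 3:7, 3:7] = 0

values, idx = torch.max(x, dim=0)

# Whichever index was chosen at a tie, gathering with the returned
# indices must recover exactly the max values.
gathered = torch.gather(x, 0, idx.unsqueeze(0)).squeeze(0)
assert torch.equal(gathered, values)
```

The values tensor is identical on every backend; only the index chosen among tied positions can differ.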

You will see the same behavior when breaking ties in min, sort, topk, and similar reduction ops.
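For sort specifically, newer PyTorch versions (1.9 and later, to my knowledge) expose a `stable=True` flag that pins down the tie order, which gives reproducible indices across devices at some cost:

```python
import torch

t = torch.tensor([2.0, 1.0, 2.0, 1.0])

# stable=True guarantees that equal elements keep their original
# relative order, so the returned indices are deterministic.
vals, idx = torch.sort(t, stable=True)
print(idx)  # tensor([1, 3, 0, 2])
```

No such flag exists for max/min themselves; for those, ties remain backend-dependent.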

The reason it is hard to make the CPU and GPU consistent is that enforcing a fixed tie-breaking order on the GPU would require serializing parts of the parallel reduction, which would cause a large hit in GPU performance.

That makes sense. Thanks for the clarification.