Why does torch.mode return different values on CPU and GPU?

Consider the following case:

import torch

tensor_input = torch.Tensor([[[[-0.6572, -0.0377,  0.2676, -0.2568, -1.6279, -0.3259, -0.1349,
           -0.6699,  1.0273,  0.0203, -0.7080,  0.9360,  0.8535,  0.0132,
            1.2920, -0.4414, -0.5073, -0.5352,  0.2313,  0.1196, -0.7681,
           -0.9087, -0.4175, -0.0583, -1.1299,  1.5000,  0.0756,  0.4622,
           -0.5273,  1.7432, -0.8896,  1.7295, -0.7310, -0.7080, -0.0253,
            0.7202,  2.2656,  1.2324,  1.0000,  0.8584, -3.2207,  0.0425,
           -1.3242, -0.0217,  0.2297, -0.3833, -0.0539,  1.2920, -0.6719,
            0.3425,  1.4785,  0.6108,  0.5913, -1.3027, -1.0791]]]])
value_gpu, index_gpu = torch.mode(tensor_input.cuda(), -1, False)
value_cpu, index_cpu = torch.mode(tensor_input.cpu(), -1, False)
print("value_gpu is:  ", value_gpu)
print("value_cpu is: ", value_cpu)

The output is:

value_gpu is:   tensor([[[1.2920]]], device='cuda:0')
value_cpu is:  tensor([[[-0.7080]]])

In the input tensor, both 1.2920 and -0.7080 appear twice. Why does the CPU return -0.7080 while the GPU returns 1.2920?

When all values appear exactly once, both the CPU and the GPU return the minimum value. Why is it that, when some values appear twice, the CPU still returns the minimum of the tied values while the GPU does not?

Is this behavior expected?

Apart from 1.2920 and -0.7080, which each appear twice, your input tensor contains only unique values, and those two tied values are exactly what the two backends return. I don't think there is any guarantee to return the smaller, the first, or any other specific value when multiple values tie for the highest count.
If you depend on a particular tie-breaking rule, you could create a feature request on GitHub and explain your use case.
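In the meantime, a deterministic result can be computed in user code. Below is a minimal sketch (the helper deterministic_mode is hypothetical, not a PyTorch API) that always returns the smallest value among those tied for the highest count, relying on torch.unique returning its values in sorted order:

import torch

def deterministic_mode(x: torch.Tensor) -> torch.Tensor:
    # Hypothetical helper, not part of PyTorch: returns the smallest value
    # among the values tied for the highest count in a 1-D tensor.
    values, counts = torch.unique(x, return_counts=True)  # values sorted ascending
    # Of all values reaching the maximal count, the first is the smallest.
    return values[counts == counts.max()][0]

flat = tensor_input.flatten()
print(deterministic_mode(flat.cpu()))   # tensor(-0.7080)
print(deterministic_mode(flat.cuda()))  # tensor(-0.7080, device='cuda:0')

This gives the same answer on both devices at the cost of an extra sort inside torch.unique.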

What do you mean by "if the occurrence of multiple values is equal"? There is only one return value, and the CPU and GPU results differ; which one is correct?

Both are correct: both values occur twice in the tensor, which is the maximal count, so there is no guarantee that one valid output is preferred over the other.
As already mentioned, if you have a use case that prefers one of the two valid outputs, feel free to create a feature request explaining the use case and the need for it.
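A quick check against the tensor from the first post confirms that both returned values reach the maximal count (exact float comparison is safe here only because the tensor was built from these exact literals):

print((tensor_input == 1.2920).sum())   # tensor(2)
print((tensor_input == -0.7080).sum())  # tensor(2)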

Based on many experiments, I believe the CPU implementation returns the minimum value whenever multiple values tie for the highest count. The GPU implementation (the compute_mode kernel), however, applies no rule for choosing between the valid outputs.
It would be friendlier to make the GPU consistent with the CPU results, so that other GPU hardware vendors could verify their torch.mode implementations against a deterministic reference.
For example, when the input shape is [3, 2, 100, 100], many more values differ between CPU and GPU, and even between NVIDIA and AMD GPUs:

import numpy as np
import torch

np_input = np.random.randn(3, 2, 100, 100).astype(np.float32)
tensor_input = torch.from_numpy(np_input)

output_gpu, output_gpu_index = torch.mode(tensor_input.cuda(), -1, False)
output_cpu, output_cpu_index = torch.mode(tensor_input.cpu(), -1, False)
# common.compare_cpu_gpu is our in-house comparison helper; kwargs holds its
# tolerance settings.
common.compare_cpu_gpu((output_cpu.numpy(), output_cpu_index.numpy()),
                       (output_gpu.cpu().numpy(), output_gpu_index.cpu().numpy()),
                       kwargs)

The result is:

============================== output_cpu ==============================
[[[-0.708    0.3645  -2.512    1.184   -2.219   -2.139    0.547
   -0.2683   1.272    1.036   -2.133   -0.807    0.359    0.695
   -1.915   -2.498   -0.78    -2.51    -0.7046  -2.797   -1.994
   -0.8647   0.3787  -1.875   -2.68    -2.74    -2.229   -2.498
   -1.894    1.238   -0.8066   1.601   -2.549   -3.203   -0.868
   -2.148   -2.197   -1.214   -1.853   -2.137    0.2764  -1.053
   -1.961   -2.367   -3.088   -1.113   -2.193   -1.92    -0.5586
   -2.09     2.066   -1.304   -2.852   -2.475   -0.6724  -2.297
   -2.504   -0.7593  -2.305   -1.7     -0.896   -2.654   -2.926
    0.5054  -2.729   -2.373   -2.01    -1.346   -2.434   -2.314
   -2.762   -0.524   -2.102   -2.47    -2.547   -2.492   -1.787
   -1.046   -1.133   -2.447   -2.504   -2.422   -0.2793  -2.924
   -2.28    -0.856   -1.172   -2.586   -1.163   -3.305   -2.188
   -2.223   -1.931   -0.5737   0.5664  -2.594   -2.465   -2.404
   -2.379   -2.082  ]
  [-0.9033   0.515   -2.84    -1.249   -2.787    1.056    1.161
    1.038   -2.09    -1.246   -0.787   -2.197    0.361    0.941
   -3.064   -2.373   -2.74    -1.007   -1.001   -2.48    -1.052
   -3.367   -2.084   -2.51    -2.402    0.877   -3.322   -1.827
   -3.125   -3.459   -1.04    -2.479   -2.207   -3.682   -2.32
   -1.977    0.5273  -2.215   -2.092   -1.254   -2.184   -2.652
   -2.256   -0.6074  -2.098   -2.678   -1.565   -2.71    -0.529
   -0.686   -3.045    0.6406  -3.164   -1.041   -2.576   -1.431
   -2.129   -2.322   -0.3386   0.619   -2.71    -0.9155  -1.529
   -2.559   -1.946   -1.304    0.747    0.4854  -2.053   -2.297
   -0.926   -3.232   -2.518    1.807   -2.416   -0.1096  -3.105
   -3.205   -1.204   -0.651   -2.08     0.561   -2.662   -3.041
   -2.215   -3.447    1.1045   0.6143  -2.344   -2.404   -1.256
   -1.987    0.5283  -2.72    -2.363   -0.702    0.947   -2.602
   -1.407   -3.156  ]]

 [[-2.627   -2.229   -1.101   -1.871    0.871   -0.2842  -2.951
    1.247   -2.467   -0.8125   0.352   -2.002   -2.154   -1.003
   -3.04    -0.823   -0.4363  -0.84    -2.37    -2.74    -2.328
   -0.8135  -0.705   -3.234    0.5425   0.7524  -2.062   -0.6357
   -0.11743 -1.124   -2.773   -2.291    1.596   -2.617   -0.6826
   -0.873    0.1764  -2.398   -0.3223   1.065   -3.07    -3.254
   -2.29    -2.65    -2.115   -2.273   -2.54    -2.066   -0.456
   -0.1404  -1.241   -1.074   -2.47    -2.066   -0.6665  -2.531
   -2.273   -0.946   -2.318    0.367   -4.098   -0.4797  -2.58
   -1.014    0.7666  -0.835   -3.713   -3.344   -2.326   -2.562
   -1.414   -1.146   -0.6465   0.5444  -1.208    0.4094  -2.51
   -2.77     0.1414  -3.008    0.6826  -2.385   -2.953   -0.502
   -2.424   -2.197   -1.989   -0.441   -2.09     0.4082  -0.709
   -2.129   -1.254   -1.759    1.334   -1.438   -2.39    -2.09
    1.46    -1.554  ]
  [-2.738   -2.47    -2.469   -2.768   -1.021   -0.2361  -0.0923
   -2.506    0.9956  -2.02    -1.187    0.6045  -3.16     1.228
   -2.375   -2.691   -3.281   -2.605    0.8843   0.8545  -0.1605
   -3.656   -2.053   -1.918   -2.027   -2.695   -1.4     -0.3152
   -0.773    0.1968   0.6445   0.2365  -2.846   -1.693    0.7827
   -1.438   -2.88    -0.6787   0.621   -2.746    0.601   -0.733
   -1.182   -0.503    2.01    -2.26    -1.864   -0.7935   1.062
    0.3706  -0.304   -0.2499  -2.23    -3.34    -2.475    0.647
   -2.604   -1.      -2.008   -1.347   -2.182   -2.541   -0.2952
   -2.49    -1.096   -0.892    0.176   -0.456   -2.338   -1.758
   -1.81    -2.652   -2.166   -0.5977  -0.517   -2.531   -3.27
    0.6865  -2.344   -3.064   -2.234   -1.076   -2.635   -2.396
   -1.297   -2.625   -0.5967  -2.115   -2.8     -3.13     1.354
   -3.287   -1.853    1.541   -2.602   -2.078    0.9824  -2.797
   -1.146   -0.723  ]]

 [[-2.66    -1.101    2.014   -4.25     1.186   -2.146   -2.818
    0.995   -3.514   -2.473   -2.488    0.848    0.3997  -2.787
   -2.936   -0.616    0.3792  -0.4143  -2.67    -2.62    -2.16
   -2.578    1.483   -2.928    1.241   -2.75    -1.106   -2.033
    0.219   -2.752   -0.2747  -0.5703  -1.722   -0.8306  -0.5273
   -2.484    1.029   -2.29    -1.621   -0.9863  -2.504   -0.2712
   -2.703   -2.791   -2.871   -1.55    -0.4758  -2.568   -0.994
    1.418    0.7886  -3.111   -0.2546   1.019    1.23     0.725
   -2.844    0.784   -2.816   -2.42    -0.5747  -3.21    -2.35
   -2.158   -3.057   -0.678   -2.404   -1.638   -2.28    -1.104
   -1.469    0.355    0.9062  -0.535    0.4365  -2.676   -2.574
   -2.174   -0.587   -1.186   -2.523   -0.4897  -2.592    0.6562
   -2.553    1.059   -2.602   -2.504   -2.934    0.8833   0.752
   -3.08    -1.385   -3.07    -1.261    1.774    1.024   -1.045
   -2.402   -3.365  ]
  [-2.643   -1.076   -2.826   -2.756   -3.254   -0.6377  -2.348
    0.656   -1.126   -3.639   -2.275   -2.545    0.6074  -0.725
   -2.098    0.4253  -1.243   -2.547   -2.879    0.566   -2.273
   -2.254    1.595    0.661   -0.9087  -1.2     -2.79    -1.828
   -0.586    1.735   -1.102   -2.412   -2.213   -0.815    1.112
    0.2179   0.783   -2.37    -1.526   -2.852   -2.885   -2.53
    0.9546  -1.129   -0.1569  -2.635   -3.508    0.533   -1.696
   -2.305   -2.123   -1.69     0.5894   0.7915  -1.984    0.7715
   -2.883   -2.383   -2.06    -1.198   -2.402   -2.465   -2.885
   -3.064   -2.348   -2.494   -0.2703  -2.277   -2.723   -2.463
   -2.158   -2.404   -2.36     0.11926 -2.254    0.2847  -2.797
   -3.326   -1.008   -1.781   -2.055   -2.707   -2.67    -0.579
   -2.31     0.9688  -0.831   -2.334    1.27    -2.34    -2.152
   -1.514   -2.877   -3.268   -2.182   -2.182    0.6895  -3.01
   -3.115   -1.88   ]]]
============================== output_gpu ==============================
[[[-0.708    0.3645  -2.512    1.184   -2.219   -2.139    0.547
   -0.2683   1.272    1.036   -2.133   -0.807    0.359    0.695
   -1.915   -2.498   -0.78    -2.51    -0.7046  -2.797   -1.994
   -0.8647   0.3787  -1.875   -2.68    -2.74    -2.229   -2.498
   -1.894    1.238   -0.8066   1.601   -2.549   -3.203   -0.868
   -2.148   -2.197   -1.214   -1.853   -2.137    0.2764  -0.4294
   -1.961   -2.367   -3.088   -1.113   -2.193   -1.92    -0.5586
   -2.09     2.066   -1.304   -2.852   -2.475   -0.6724  -2.297
   -2.504   -0.7593  -2.305   -1.7     -0.896   -2.654   -2.926
    0.5054  -2.729   -2.373   -2.01    -1.346   -2.434   -2.314
   -2.762   -0.524   -2.102   -2.47    -2.547   -2.492   -1.787
   -1.046   -1.133   -2.447   -2.504   -2.422   -0.2793  -2.924
   -2.28    -0.856   -1.172   -2.586   -1.163   -3.305   -2.188
   -2.223   -1.931   -0.5737   0.5664  -2.594   -2.465   -2.404
   -2.379   -2.082  ]
  [-0.9033   0.515   -2.84    -1.249   -2.787    1.056    1.161
    1.038   -2.09    -1.246   -0.787   -2.197    0.361    0.941
   -3.064   -2.373   -2.74    -1.007   -1.001   -2.48    -1.052
   -3.367   -2.084   -2.51    -2.402    0.877   -3.322   -1.827
   -3.125   -3.459   -1.04    -2.479   -0.2683  -3.682   -2.32
   -1.977    0.5273  -2.215   -2.092   -1.254   -2.184   -2.652
   -2.256   -0.6074  -2.098   -2.678   -1.565   -2.71    -0.529
    0.2197  -3.045    0.6406  -3.164   -0.73    -2.576   -1.431
   -2.129   -2.322   -0.3386   0.619   -2.71    -0.9155  -1.529
   -2.559   -1.946   -1.304    0.747    0.4854  -2.053   -2.297
   -0.926   -3.232   -2.518    1.807   -2.416   -0.1096  -3.105
   -3.205   -1.204   -0.651   -2.08     0.561   -2.662   -3.041
   -2.215   -3.447    1.1045   0.6143  -2.344   -2.404   -1.256
   -1.987    0.5493  -2.72    -2.363   -0.702    0.947   -2.602
   -1.407   -3.156  ]]

 [[-2.627   -2.229   -0.2322  -1.871    0.871   -0.2842  -2.951
    1.247   -2.467   -0.8125   0.352   -2.002   -2.154   -1.003
   -3.04    -0.823   -0.4363  -0.84    -2.37    -2.74    -2.328
   -0.8135  -0.705   -3.234    0.5425   0.7524  -2.062   -0.6357
   -0.11743 -1.124   -2.773   -2.291    1.596   -2.617   -0.6826
   -0.873    0.1764  -2.398   -0.3223   1.065   -3.07    -3.254
   -2.29    -2.65    -2.115   -2.273   -2.54    -2.066   -0.456
   -0.1404  -1.241   -0.851   -2.47    -2.066   -0.6665  -2.531
   -2.273   -0.946   -2.318    0.367   -4.098   -0.4797  -2.58
   -1.014    0.7666  -0.835   -3.713   -3.344   -2.326   -2.562
   -0.62    -1.146   -0.6465   0.5444  -1.208    0.4094  -2.51
   -2.77     0.1414  -3.008    0.6826  -2.385   -2.953   -0.502
   -2.424   -2.197   -1.989   -0.441   -2.09     0.4082   0.2576
   -2.129   -1.254   -1.759    1.334   -1.438   -2.39    -2.09
    1.46    -1.554  ]
  [-2.738   -2.47    -2.469   -2.768   -1.021   -0.2361  -0.0923
   -2.506    0.9956  -2.02    -0.7153   0.6045  -3.16     1.228
   -2.375   -2.691   -3.281   -2.605    0.8843   0.8545  -0.1605
   -3.656   -2.053   -1.918   -2.027   -2.695   -1.4     -0.1771
   -0.773    0.1968   0.6445   0.2365  -2.846   -1.693    0.7827
   -1.438   -2.88    -0.6787   0.621   -2.746    0.601   -0.733
   -1.182   -0.503    2.01    -2.26    -1.864   -0.7935   1.062
    0.3706  -0.304   -0.2499  -2.23    -3.34    -2.475    0.647
   -2.604   -1.      -2.008   -1.347   -2.182   -2.541   -0.2952
   -2.49    -1.096   -0.892    0.176   -0.456   -2.338   -1.758
   -1.81    -2.652   -2.166   -0.5977  -0.517   -2.531   -3.27
    0.6865  -2.344   -3.064   -2.234   -1.076   -2.635   -2.396
   -1.297   -2.625   -0.5967  -2.115   -2.8     -3.13     1.354
   -3.287   -1.853    1.541   -2.602   -2.078    0.9824  -2.797
   -1.146   -0.723  ]]

 [[-2.66    -1.101    2.014   -4.25     1.186   -2.146   -2.818
    1.121   -3.514   -2.473   -2.488    0.848    0.3997  -2.787
   -2.936   -0.616    0.3792  -0.4143  -2.67    -2.62    -2.16
   -2.578    1.483   -2.928    1.241   -2.75    -1.106   -2.033
    0.219   -2.752   -0.2747  -0.5703  -1.722   -0.8306  -0.5273
   -2.484    1.029   -2.29    -1.621   -0.9863  -2.504   -0.2712
   -2.703   -2.791   -2.871   -1.55    -0.4758  -2.568   -0.7886
    1.418    0.7886  -3.111   -0.2546   1.019    1.23     1.676
   -2.844    0.784   -2.816   -2.42    -0.5747  -3.21    -2.35
   -2.158   -3.057   -0.678   -2.404   -1.638   -2.28    -1.104
   -1.469    0.355    0.9062  -0.535    0.4365  -2.676   -2.574
   -2.174   -0.587   -1.186   -2.523   -0.4897  -2.592    0.6562
   -2.553    1.059   -2.602   -2.504   -2.934    0.8833   0.752
   -3.08    -1.385   -3.07    -1.261    1.774    1.024   -1.045
   -2.402   -3.365  ]
  [-2.643   -1.076   -2.826   -2.756   -3.254   -0.6377  -2.348
    0.656   -1.126   -3.639   -2.275   -2.545    0.6074  -0.725
   -2.098    0.4253  -1.243   -2.547   -2.879    0.566   -2.273
   -2.254    1.595    0.661   -0.9087  -1.2     -2.79    -1.828
   -0.586    1.735   -1.102   -2.412   -2.213   -0.815    1.112
    0.2179   0.783   -2.37    -1.526   -2.852   -2.885   -2.53
    0.9546   0.1125  -0.1569  -2.635   -3.508    1.453   -1.696
   -2.305   -2.123   -1.69     0.5894   0.7915  -1.984    0.7715
   -2.883   -2.383   -2.06    -1.198   -2.402   -2.465   -2.885
   -3.064   -2.348   -2.494   -0.2703  -2.277   -2.723   -2.463
   -2.158   -2.404   -2.36     0.11926 -2.254    0.2847  -2.797
   -3.326   -1.008   -1.781   -2.055   -2.707   -2.67    -0.579
   -2.31     0.9688  -0.831   -2.334    1.27    -2.34    -2.152
   -1.514   -2.877   -3.268   -2.182   -2.182    0.6895  -3.01
   -3.115   -1.88   ]]]
============================== diff       ==============================
[[[0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.623  0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.    ]
  [0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     1.938  0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.906
   0.     0.     0.     0.311  0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.021  0.     0.     0.     0.     0.     0.     0.    ]]

 [[0.     0.     0.868  0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.2231 0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.794  0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.967  0.     0.     0.     0.     0.     0.     0.     0.     0.    ]
  [0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.4712 0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.1381 0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.    ]]

 [[0.     0.     0.     0.     0.     0.     0.     0.126  0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.2056 0.
   0.     0.     0.     0.     0.     0.9507 0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.    ]
  [0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     1.241  0.     0.     0.     0.92   0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.
   0.     0.     0.     0.     0.     0.     0.     0.     0.     0.    ]]]
========================================================================
[INFO 2022-12-06 09:51:59,357 common.py:105] CPU and GPU diff summary:
output shape: (3, 2, 100)
num  nan:  0
rms diff:  0.1355418091632615
max diff:  1.938 pos: (0, 1, 32) cpu: -2.207 gpu: -0.2683
min diff:  0.0
max_diff_rate:  1.726 pos: (2, 1, 47) cpu: 0.533 gpu: 1.453
num failed:  11
worst case:  1.672 pos: (0, 1, 32) cpu: -2.207 gpu: -0.2683
========================================================================
[INFO 2022-12-06 09:51:59,359 common.py:123] Test FAILED

From results like these alone, we generally cannot tell whether the discrepancy comes from a problem in the CUDA kernel or a problem with the hardware.
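One way to narrow this down is to verify, at every disagreeing position, that both returned values are valid modes of the corresponding row: if either is not, the kernel (or the hardware) is genuinely wrong rather than merely breaking ties differently. A sketch, reusing tensor_input, output_cpu, and output_gpu from the snippet above:

import torch

def is_valid_mode(row: torch.Tensor, candidate: torch.Tensor) -> bool:
    # A value is a valid mode of `row` iff its count equals the maximal count.
    values, counts = torch.unique(row, return_counts=True)
    mask = values == candidate
    return bool(mask.any()) and counts[mask].item() == counts.max().item()

rows = tensor_input.reshape(-1, tensor_input.shape[-1])
cpu_vals = output_cpu.reshape(-1)
gpu_vals = output_gpu.cpu().reshape(-1)
for i in torch.nonzero(cpu_vals != gpu_vals).flatten():
    cpu_ok = is_valid_mode(rows[i], cpu_vals[i])
    gpu_ok = is_valid_mode(rows[i], gpu_vals[i])
    if not (cpu_ok and gpu_ok):
        print(f"row {i}: a returned value is not a mode, genuine bug")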