Sum of the same values returns different results with PyTorch but not NumPy


I really cannot explain what is going on here: I have two tensors of size (1000, 2, 2) and (1000, 2, 5) which share the same values for the first two slices of the third dimension, i.e. [:, :, :2].

z0 = torch.rand(1000, 2, 2)
z1 = torch.cat([z0, torch.rand(1000, 2, 3)], dim=2)

s0 = z0[:, :, :2].sum()
s1 = z1[:, :, :2].sum()

But when I sum over the same slice I do not get the same result. I have a feeling there's some floating-point precision issue, but I don't see how it can apply in this case, since the elements of the tensors are exactly equal. NumPy, on the other hand, does not show this behaviour.

>>> torch.all(z0[:, :, :2] == z1[:, :, :2])
tensor(True)
>>> s0 == s1
tensor(False)
>>> (s0.sum() - s1.sum()).abs().sum().numpy()
array(1.6212463e-05, dtype=float32)
>>> z0.numpy()[:, :, :2].sum() == z1.numpy()[:, :, :2].sum()
True
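For context on why equal elements can still sum differently: floating-point addition is not associative, so the same values accumulated in a different order (or with a different algorithm) can yield slightly different totals. A minimal pure-Python sketch (not the code above):

```python
import math

# Floating-point addition is not associative: accumulating the same
# values in a different order shifts the rounding error, so the totals
# need not be bit-identical.
xs = [0.1] * 10

naive = sum(xs)        # plain left-to-right accumulation
exact = math.fsum(xs)  # compensated (Shewchuk) summation

print(naive)           # 0.9999999999999999
print(exact)           # 1.0
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False
```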

Any clue as to why this is happening?

EDIT: if I switch the dimensions so that I am now summing over the first, i.e. z1.shape = [5, 2, 1000] and s1 = z1[:2, :, :].sum() (and likewise for z0 and s0), I don't have the numerical issue. Why?

My guess would be that different reduction algorithms get picked depending on the memory layout of the data and its alignment.
(s0.sum() - s1.sum()).abs().max().numpy() returns approximately array(1.6212463e-05, dtype=float32), which is within the expected float32 floating-point precision for sums of this magnitude.
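One way to probe this hypothesis (a sketch, assuming a CPU build of PyTorch): the slice of z1 is a non-contiguous strided view, whereas the same slice of z0 is the whole contiguous tensor. Forcing a contiguous copy before reducing should restore the layout z0 has and make the sums agree again:

```python
import torch

z0 = torch.rand(1000, 2, 2)
z1 = torch.cat([z0, torch.rand(1000, 2, 3)], dim=2)

# z0[:, :, :2] is just z0 itself (contiguous), while the same slice of
# z1 is a strided view over a larger buffer, so sum() may traverse the
# elements in a different order or with a different reduction kernel.
print(z0[:, :, :2].is_contiguous())  # True
print(z1[:, :, :2].is_contiguous())  # False

# Copying the slice into a fresh contiguous buffer gives it the same
# layout as z0, so the reduction should take the same code path.
print(z0.sum() == z1[:, :, :2].contiguous().sum())
```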