Numerical Difference in Matrix Multiplication and Summation

I am computing a vector-matrix multiplication in two different ways. Mathematically they are equivalent, yet PyTorch gives slightly different results for them. Can someone please explain why this happens, and whether the slight difference can be ignored in practice?

import torch
import numpy as np

x = torch.from_numpy(np.array(range(12))).view(-1, 3, 4).float()
ww = torch.rand(5, 12)

# Method 1: elementwise multiply, then sum over the last dimension
y1 = torch.sum(x.view(-1, 12) * ww, dim=1)
# Method 2: matrix multiplication with the transposed weights
y2 = torch.matmul(x.view(-1, 12), ww.t())
print(y1 - y2)

It’s clear that y1 should equal y2, but the result is non-zero:

tensor(1.00000e-06 *
       [[ 0.0000,  0.0000,  0.0000,  0.0000,  1.9073]])

A difference of approximately 1e-6 is expected given float32 precision: the two methods accumulate the partial products in a different order, and floating-point addition is not associative. In practice you can ignore it. If you need higher precision, you could use float64 instead; note, however, that float64 performance will suffer on the GPU.
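For example, repeating the same comparison in float64 shrinks the residual by many orders of magnitude (a quick sketch of the idea; the exact values depend on the random weights):

```python
import torch
import numpy as np

# Same computation as above, but carried out in float64.
x = torch.from_numpy(np.array(range(12))).view(-1, 3, 4).double()
ww = torch.rand(5, 12, dtype=torch.float64)

y1 = torch.sum(x.view(-1, 12) * ww, dim=1)
y2 = torch.matmul(x.view(-1, 12), ww.t())

# The largest discrepancy is now far below the float32-level 1e-6.
print((y1 - y2).abs().max())
```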

Here is another example. Clearly the sums should yield the same number. However, due to the order of operations, you will get a small difference between both methods:

x = torch.randn(10, 10, 10)
print(x.sum() - x.sum(0).sum(0).sum(0))
> tensor(-0.000005722)
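The root cause is easy to see even in plain Python: floating-point addition is not associative, so summing the same numbers in a different order can change the last bits of the result.

```python
a, b, c = 0.1, 0.2, 0.3

# Mathematically identical, but the rounding happens in a different order.
left = (a + b) + c
right = a + (b + c)

print(left == right)   # False
print(left - right)    # a difference on the order of machine epsilon
```

For this reason, tensors computed along different paths should be compared with a tolerance, e.g. torch.allclose(y1, y2), rather than with exact equality.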