I am computing a vector-matrix product in two different ways. Mathematically they are equivalent, but PyTorch gives slightly different results for them. Can someone please explain why this happens? Hopefully the slight difference can be ignored in practice.
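import torch
import numpy as np
# x holds 0..11 as float32, shape (1, 3, 4)
x = torch.from_numpy(np.array(range(12))).view(-1, 3, 4).float()
ww = torch.rand(5, 12)  # random weights, shape (5, 12)
# Way 1: broadcast multiply (1, 12) * (5, 12), then sum each row -> shape (5,)
y1 = torch.sum(x.view(-1, 12) * ww, dim=1)
# Way 2: matrix product (1, 12) @ (12, 5) -> shape (1, 5)
y2 = torch.matmul(x.view(-1, 12), ww.t())
print(y1 - y2)  # broadcasts to shape (1, 5)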
Clearly y1 should equal y2, but the difference is non-zero:
tensor(1.00000e-06 *
[[ 0.0000, 0.0000, 0.0000, 0.0000, 1.9073]])
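One way to check whether a difference like this is just float32 rounding is to compare with a tolerance instead of exact equality, and to repeat both computations in float64. The sketch below is a minimal illustration under an assumption that is not confirmed here: that the two code paths add up the 12 products in a different order, and floating-point addition is not associative. The seed, the tolerances passed to `torch.allclose`, and the variable names `close32` and `max_diff64` are my own choices for illustration.

```python
import torch

# Floating-point addition is not associative, so the order of summation matters.
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False

torch.manual_seed(0)  # assumption: fixed seed, only for reproducibility
x = torch.arange(12, dtype=torch.float32).view(1, 12)
ww = torch.rand(5, 12)

# The same two computations as above, with y2 flattened to shape (5,)
y1 = torch.sum(x * ww, dim=1)
y2 = torch.matmul(x, ww.t()).squeeze(0)

# Tolerance-based comparison rather than y1 == y2
# (rtol and atol are illustrative values, not recommendations)
close32 = torch.allclose(y1, y2, rtol=1e-5, atol=1e-6)

# Repeat in float64: any remaining gap should be far smaller than in float32
xd, wd = x.double(), ww.double()
y1d = torch.sum(xd * wd, dim=1)
y2d = torch.matmul(xd, wd.t()).squeeze(0)
max_diff64 = (y1d - y2d).abs().max().item()

print(close32, max_diff64)
```

If `close32` is true and the float64 gap is many orders of magnitude smaller than the float32 one, that is consistent with ordinary rounding error rather than a bug in either formula.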