Question about matrix multiplication precision

During my training, I found that the two calculations below don't give equal results. I can't figure out why. Could you please help me?

    import torch

    a = torch.tensor([[0.0641, 0.0434, 0.7166, 1.0000],
                      [0.0642, 0.0434, 0.7170, 1.0000]])
    T = torch.tensor([[9.3945e-01, 0.0000e+00, -3.4204e-01, 0.0000e+00],
                      [-2.9890e-08, 1.0000e+00, -8.2131e-08, -3.5527e-15],
                      [3.4204e-01, 8.7428e-08, 9.3945e-01, 7.0020e-01],
                      [0.0000e+00, 0.0000e+00, 0.0000e+00, 1.0000e+00]])

    res_1 = (T @ a[..., None]).reshape(2, 4)  # batched matrix-vector products
    res_2 = (T @ a.T).T                       # single matrix-matrix product
    print(res_1 - res_2)  # not zero

These small differences are expected: res_1 computes the result as two batched matrix-vector products, while res_2 computes it as a single matrix-matrix product, so the intermediate sums are accumulated in a different order. With limited floating-point precision, a different order of operations can round differently. The numerical accuracy notes in the PyTorch documentation explain this in more detail.
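
A quick way to see this (a minimal sketch using random stand-in tensors rather than your exact values): the mismatch is on the order of float32 machine epsilon, torch.allclose still treats the two results as equal within tolerance, and repeating the computation in float64 shrinks the gap to around 1e-16.

    import torch

    torch.manual_seed(0)
    a = torch.rand(2, 4)   # hypothetical stand-in for the tensors above
    T = torch.rand(4, 4)

    # Same product, two different reduction orders.
    res_1 = (T @ a[..., None]).reshape(2, 4)  # two batched matrix-vector products
    res_2 = (T @ a.T).T                       # one matrix-matrix product

    print((res_1 - res_2).abs().max())   # typically ~1e-7 in float32 (may be 0)
    print(torch.allclose(res_1, res_2))  # True: equal within default tolerances

    # Repeating in float64 shrinks the mismatch to ~1e-16, but bitwise
    # equality between the two orderings is still not guaranteed.
    a64, T64 = a.double(), T.double()
    res_1_64 = (T64 @ a64[..., None]).reshape(2, 4)
    res_2_64 = (T64 @ a64.T).T
    print((res_1_64 - res_2_64).abs().max())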

Thanks! I understand now.