Weird behavior of torch.matmul

Hi,

I’m trying to implement a Gaussian process with PyTorch and ran into some weird behavior when using the torch.matmul function, which made me wonder whether I’m using it correctly!

k1 = tensor([[1.0000, 1.0000, 0.9998],
             [1.0000, 1.0000, 1.0000],
             [0.9998, 1.0000, 1.0000]], dtype=torch.float64)

k2 = tensor([[1.0000, 1.0000, 0.9998, 0.9968],
             [1.0000, 1.0000, 1.0000, 0.9976],
             [0.9998, 1.0000, 1.0000, 0.9982]], dtype=torch.float64)

alphas = tensor([[-12851.3568],
                 [ 28321.3013],
                 [-15470.6538]], dtype=torch.float64)

Now comes the weird result: the first three elements of y2 should be the same as y1, but they are totally different!

y1 = torch.matmul(k1.T, alphas) = tensor([[0.8400],
                                          [0.9900],
                                          [0.2900]], dtype=torch.float64)

y2 = torch.matmul(k2.T, alphas) = tensor([[ 0.9685],
                                          [ 0.7068],
                                          [ 0.4447],
                                          [-1.1308]], dtype=torch.float64)

What did I do wrong here?!

Thanks in advance for your tips.

They give the same values; please check your result again.

Thanks for the reply, it’s interesting! When I create these two tensors by hand, the output is correct as expected, but when they (k1, k2) come from my kernel, the result is wrong and equal to what I posted!

To reproduce the problem, I’ll add my kernel below.

import torch

class SQE:
    def __init__(self, l, sigma_f, noise):
        self.l = l
        self.sigma_f = sigma_f
        self.noise = noise

    def __call__(self, xi, xj):
        # scale the inputs by the lengthscale
        xi_ = xi.div(self.l)
        xj_ = xj.div(self.l)

        x1_eq_x2 = torch.equal(xi, xj)
        # center the inputs for numerical stability; distances are
        # translation invariant, so this does not change the result
        adj = xi_.mean(-2, keepdim=True)
        x1 = xi_ - adj
        x2 = xj_ - adj

        x1_norm = x1.pow(2).sum(dim=-1, keepdim=True)
        x1_pad = torch.ones_like(x1_norm)

        if x1_eq_x2:
            x2_norm, x2_pad = x1_norm, x1_pad
        else:
            x2_norm = x2.pow(2).sum(dim=-1, keepdim=True)
            x2_pad = torch.ones_like(x2_norm)

        # pairwise squared distances in one matmul:
        # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 * a.b
        x1_ = torch.cat([-2.0 * x1, x1_norm, x1_pad], dim=-1)
        x2_ = torch.cat([x2, x2_pad, x2_norm], dim=-1)
        res = x1_.matmul(x2_.transpose(-2, -1))

        if x1_eq_x2:
            # force exact zeros on the diagonal for identical inputs
            res.diagonal(dim1=-2, dim2=-1).fill_(0)

        res.clamp_min_(0)
        res.div_(-2).exp_()  # exp(-d^2 / 2)

        res = res * self.sigma_f
        if x1_eq_x2:
            # noise jitter is added only when xi and xj are the same tensor
            res.add_(torch.eye(res.shape[0], dtype=torch.float64) * self.noise)

        return res
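
The cat/matmul part computes all pairwise squared distances in one go via the expansion ||a - b||^2 = ||a||^2 + ||b||^2 - 2 * a.b (the centering with adj is only for numerical stability, since distances are translation invariant). A quick sketch with arbitrary inputs, if you want to check this yourself:

import torch

x = torch.randn(3, 1, dtype=torch.float64)
z = torch.randn(4, 1, dtype=torch.float64)

# same expansion as in SQE.__call__
d2 = (x.pow(2).sum(-1, keepdim=True)
      + z.pow(2).sum(-1, keepdim=True).transpose(-2, -1)
      - 2.0 * x.matmul(z.transpose(-2, -1))).clamp_min(0)

print(torch.allclose(d2, torch.cdist(x, z).pow(2)))  # True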

Then we have the following samples:


x_1 = torch.tensor([[0.12], [0.13], [0.14]], dtype=torch.float64)
y_1 = torch.tensor([[0.84], [0.99], [0.29]], dtype=torch.float64)

x_2 = torch.tensor([[0.12], [0.13], [0.14], [0.2]], dtype=torch.float64)
y_2 = torch.tensor([[0.84], [0.99], [0.29], [0.84]], dtype=torch.float64)

Defining a kernel like this:

l = torch.tensor(1, dtype=torch.float64)
sf = torch.tensor(1, dtype=torch.float32)

kernel = SQE(l, sf, 1e-5)

Then calculating k1, k2 and alphas like below:

k1 = kernel(x_1, x_1)
k2 = kernel(x_1, x_2)

L = torch.linalg.cholesky(k1)
s1 = torch.linalg.solve_triangular(L, y_1, upper=False)
alphas = torch.linalg.solve_triangular(L.t(), s1, upper=True)

Now the output is like what I posted before!
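
As a sanity check (continuing from the code above), alphas solves k1 @ alphas = y_1 via the Cholesky factor, so multiplying back recovers the training targets; that is why y1 in my first post equals y_1 up to rounding:

print(torch.allclose(k1.matmul(alphas), y_1))  # True, up to solver tolerance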

Your k1 and k2 kernels have a numerical mismatch of 1e-5 on their diagonal (the self.noise jitter is added to k1 because x1_eq_x2 is True there, but not to the cross-covariance k2):

kernel = SQE(l, sf, 1e-5)

k1 = kernel(x_1, x_1)
k2 = kernel(x_1, x_2)

print(k2[:, :3] - k1)
# tensor([[-1.0000e-05,  0.0000e+00,  0.0000e+00],
#         [ 0.0000e+00, -1.0000e-05,  0.0000e+00],
#         [ 0.0000e+00,  0.0000e+00, -1.0000e-05]], dtype=torch.float64)

which is then amplified into a much larger mismatch in the matmul:

y1 = torch.matmul(k1.T, alphas)
y2 = torch.matmul(k2.T, alphas)

print(y1 - y2[:3])
# tensor([[-0.1285],
#         [ 0.2832],
#         [-0.1547]], dtype=torch.float64)

The amplification happens because alphas contains values on the order of 1e4.
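
In fact, since k1 and k2[:, :3] differ by exactly 1e-5 on the diagonal, the difference above is just the perturbation scaled by alphas:

print(1e-5 * alphas)
# tensor([[-0.1285],
#         [ 0.2832],
#         [-0.1547]], dtype=torch.float64)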

You will get the same result if you manually add this mismatch:

k1 = torch.tensor([[1.0000, 1.0000, 0.9998],
                   [1.0000, 1.0000, 1.0000],
                   [0.9998, 1.0000, 1.0000]], dtype=torch.float64)

k2 = torch.tensor([[1.0000, 1.0000, 0.9998, 0.9968],
                   [1.0000, 1.0000, 1.0000, 0.9976],
                   [0.9998, 1.0000, 1.0000, 0.9982]], dtype=torch.float64)

print(k2[:, :3] - k1)
# tensor([[0., 0., 0.],
#         [0., 0., 0.],
#         [0., 0., 0.]], dtype=torch.float64)

alphas = torch.tensor([[-12851.3568],
                       [ 28321.3013],
                       [-15470.6538]], dtype=torch.float64)

y1 = torch.matmul(k1.T, alphas)
y2 = torch.matmul(k2.T, alphas)
print(y1 - y2[:3])
# tensor([[0.],
#         [0.],
#         [0.]], dtype=torch.float64)

# add numerical mismatches
k1[torch.arange(3), torch.arange(3)] += 1e-5
y1 = torch.matmul(k1.T, alphas)
y2 = torch.matmul(k2.T, alphas)

print(y1 - y2[:3])
# tensor([[-0.1285],
#         [ 0.2832],
#         [-0.1547]], dtype=torch.float64)

Thanks, nice catch, that’s it!