The output of torch.bmm() is not equal to the corresponding numpy multiplication

I have a tensor x with shape (3, 2, 3) and a tensor y with shape (3, 3, 480000), and I compute torch.bmm(x, y), which I will call Z.

At the same time, I computed Z_n using torch.from_numpy(x.numpy() @ y.numpy()), and I find that Z is not equal to Z_n. Why is that?

The code looks like this:
X = torch.Tensor(x)  # size (3, 2, 3)
Y = torch.Tensor(y)  # size (3, 3, 480000)
Z = torch.bmm(X, Y)  # size (3, 2, 480000)
Z_n = torch.from_numpy(X.numpy() @ Y.numpy())

I just tried your snippet with random numbers, and the difference I got (for floats) was on the order of 1e-7, so it seems equivalent to me. Do you have a reproducible snippet?

For reference, here is what I used:

import torch
import numpy as np

X = torch.rand(3, 2, 3)
Y = torch.rand(3, 3, 480000)

Z = torch.bmm(X, Y)
Z_n = torch.from_numpy(X.numpy() @ Y.numpy())

print((Z - Z_n).abs().max())

@fmassa
The relevant code is:

batch = torch.from_numpy(batch)  # batch is the image tensor with values in [0, 1], batch size 3, size (3, 1200, 1600, 3)
initial = np.array([[2, 0, 0], [0, 2, 0]])
initial = initial.astype('float32')
initial = initial.flatten()
W = torch.zeros(1200 * 1600 * 3, 6)
b = torch.Tensor(initial)
h = torch.from_numpy(torch.zeros(3, 1200 * 1600 * 3).numpy() @ W.numpy() + b.numpy())
h = h.view(-1, 2, 3).type(torch.Tensor)  # size (3, 2, 3)
x_t = torch.mm(torch.ones(600, 1), torch.unsqueeze(torch.linspace(-1.0, 1.0, 800), 1).transpose(1, 0))
y_t = torch.mm(torch.unsqueeze(torch.linspace(-1.0, 1.0, 600), 1), torch.ones(1, 800))
x_t_flat = x_t.view(1, -1)
y_t_flat = y_t.view(1, -1)
ones = torch.ones(x_t_flat.size())
grid = torch.cat([x_t_flat, y_t_flat, ones], 0)
grid = grid.view(-1).repeat(3, 1).view(3, 3, -1)  # size (3, 3, 480000)

First, I use this:

T=torch.from_numpy(h.numpy() @ grid.numpy())

and then another way:

T_another=torch.bmm(h,grid)

The subsequent code is omitted.

The final output image matrix cannot be displayed normally with imshow() when using the second way, but the first way is OK. When I looked back at the code, I found that T is not equal to T_another.

It’s very difficult to debug anything with the snippet you sent. Check the value of the difference between both matrices; I suspect you either have NaNs, or the difference is such that you go slightly beyond 0 or 1.
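For example, a quick check along these lines (just a sketch; T and T_another are the two results from your snippet, and (x != x) is used as a portable NaN test):

diff = (T - T_another).abs()
print(diff.max())                        # size of the worst discrepancy
print((T_another != T_another).sum())    # number of NaNs in the bmm result
print(T.min(), T.max())                  # value ranges of both results
print(T_another.min(), T_another.max())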

Now I have found something strange.
I have a tensor theta with size (3, 2, 3):

theta

2 0 0
0 2 0
[torch.FloatTensor of size 2x3]

2 0 0
0 2 0
[torch.FloatTensor of size 2x3]

2 0 0
0 2 0
[torch.FloatTensor of size 2x3]

and a tensor grid with size (3, 3, 480000):

grid

-1.0000 -0.9975 -0.9950 … 0.9950 0.9975 1.0000
-1.0000 -1.0000 -1.0000 … 1.0000 1.0000 1.0000
1.0000 1.0000 1.0000 … 1.0000 1.0000 1.0000
[torch.FloatTensor of size 3x480000]

-1.0000 -0.9975 -0.9950 … 0.9950 0.9975 1.0000
-1.0000 -1.0000 -1.0000 … 1.0000 1.0000 1.0000
1.0000 1.0000 1.0000 … 1.0000 1.0000 1.0000
[torch.FloatTensor of size 3x480000]

-1.0000 -0.9975 -0.9950 … 0.9950 0.9975 1.0000
-1.0000 -1.0000 -1.0000 … 1.0000 1.0000 1.0000
1.0000 1.0000 1.0000 … 1.0000 1.0000 1.0000
[torch.FloatTensor of size 3x480000]

Then I use torch.bmm:

torch.bmm(theta,grid)

( 0 ,.,.) =
-2.0000 -1.9950 -1.9900 … 0.0000 0.0000 0.0000
-2.0000 -2.0000 -2.0000 … 0.0000 0.0000 0.0000

( 1 ,.,.) =
-2.0000 -1.9950 -1.9900 … 0.0000 0.0000 0.0000
-2.0000 -2.0000 -2.0000 … 0.0000 0.0000 0.0000

( 2 ,.,.) =
-2.0000 -1.9950 -1.9900 … 0.0000 0.0000 0.0000
-2.0000 -2.0000 -2.0000 … 0.0000 0.0000 0.0000
[torch.FloatTensor of size 3x2x480000]

Obviously the output is wrong; the right answer is:

theta.numpy() @ grid.numpy()

( 0 ,.,.) =
-2.0000 -1.9950 -1.9900 … 1.9900 1.9950 2.0000
-2.0000 -2.0000 -2.0000 … 2.0000 2.0000 2.0000

( 1 ,.,.) =
-2.0000 -1.9950 -1.9900 … 1.9900 1.9950 2.0000
-2.0000 -2.0000 -2.0000 … 2.0000 2.0000 2.0000

( 2 ,.,.) =
-2.0000 -1.9950 -1.9900 … 1.9900 1.9950 2.0000
-2.0000 -2.0000 -2.0000 … 2.0000 2.0000 2.0000
[torch.FloatTensor of size 3x2x480000]

So I wonder how this is happening?
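One way to narrow this down might be to compare a single batch element computed with plain torch.mm against the bmm result (a rough sketch using the theta and grid above):

ref = torch.mm(theta[0], grid[0])   # 2D matmul of the first batch element, size (2, 480000)
out = torch.bmm(theta, grid)[0]     # first slice of the bmm result
print((ref - out).abs().max())      # a nonzero value here points at bmm itself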

The reason seems to be that the matrix dimensions are too large, and PyTorch omits the bottom part of the matrix. Are there any solutions?
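So far the only workaround I can think of is to split grid into chunks along the wide dimension and run bmm on each piece (just a sketch, with an arbitrary chunk size):

chunks = []
for start in range(0, grid.size(2), 60000):
    chunks.append(torch.bmm(theta, grid[:, :, start:start + 60000]))
T_chunked = torch.cat(chunks, 2)  # same shape as torch.bmm(theta, grid)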

If you can come up with a small minimal working example (with random inputs) that reproduces the problem, could you please open an issue in pytorch?

I will give it a try, thank you.
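For reference, something along these lines might serve as a starting point for the repro (a sketch using the shapes from this thread; whether it actually triggers the problem will likely depend on the PyTorch version and BLAS backend):

import torch

theta = torch.Tensor([[2, 0, 0], [0, 2, 0]]).unsqueeze(0).repeat(3, 1, 1)  # size (3, 2, 3)
grid = torch.rand(3, 3, 480000)                                            # size (3, 3, 480000)

Z = torch.bmm(theta, grid)
Z_n = torch.from_numpy(theta.numpy() @ grid.numpy())

print((Z - Z_n).abs().max())  # a large value here would reproduce the reported mismatch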