# SpatialConvolution / Conv2d: different results in PyTorch and torch7 for float tensors

Hi,
I tried the same convolution in PyTorch and torch7 (Lua).
With identical float tensors as input, the two produce slightly different results.

Python code:

``````
import torch
import torch.nn as nn

torch.set_default_tensor_type('torch.FloatTensor')

def test_conv():
    # 64 -> 64 channels, 3x3 kernel, constant weights and bias
    layer = nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1, bias=True)
    layer.weight.data.fill_(2.2)
    layer.bias.data.fill_(1.2)

    tensor = torch.zeros((1, 64, 256, 256))
    tensor.fill_(1.3)
    # print(tensor)

    result = layer(tensor)
    print("[test_conv] result: shape={ %s }, type='%s'\n" % (result.shape, result.type()))

    result_flatten = result.flatten()

    # print the first 30 output values
    i = 0
    for n in result_flatten:
        if i >= 30:
            break
        number = n.item()
        print("[%d] %f" % (i + 1, number))
        i += 1

test_conv()
``````

Result:

``````
[1] 733.357605
[2] 1099.435791
[3] 1099.435791
...
[30] 1099.435791
``````

Torch7 Lua code:

``````
require 'nn'

torch.setdefaulttensortype('torch.FloatTensor')

function test_conv()
  local kernel_size = 3
  local stride = 1
  -- The padding and layer definition were missing from the post;
  -- reconstructed here to mirror the PyTorch Conv2d above (an assumption).
  local padding = 1
  local layer = nn.SpatialConvolution(64, 64, kernel_size, kernel_size,
                                      stride, stride, padding, padding)
  layer.weight:fill(2.2) -- fill weights with 2.2
  layer.bias:fill(1.2)   -- fill bias with 1.2

  local tensor = torch.Tensor(1, 64, 256, 256)
  tensor:fill(1.3)       -- fill tensor with 1.3
  -- print(tensor)

  local result = layer(tensor)

  print(string.format("result: shape={ %s }, type='%s'\n", result:size(), result:type()))

  -- print the first 30 output values
  local result_flatten = result:view(result:nElement())
  for i = 1, 30 do
    print(string.format("[%d] %f", i, result_flatten[i]))
  end
end

test_conv()
``````

Result:

``````
[1] 733.360107
[2] 1099.439697
[3] 1099.439697
....
[30] 1099.437012
``````

Difference between results:

``````
pytorch               torch7
[1]  733.357605       [1]  733.360107
[2]  1099.435791      [2]  1099.439697
[3]  1099.435791      [3]  1099.439697
....
[30] 1099.435791      [30] 1099.437012
``````

Could this be caused by differences in floating-point arithmetic between PyTorch and torch7?
Or should I use a different convolution operator?

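As you suspect, differences like this are within numerical accuracy and thus would be expected between different implementations of the same operation. The gap is about 3e-3 on values of roughly 7e2 to 1e3, so a relative error of ~3e-6ish, which is not unusual for float32. Each output here sums several hundred products, and the rounding depends on the order in which they are accumulated.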
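As a quick check of that estimate, here is a minimal sketch in plain Python that uses only the numbers printed in this thread. The helper names `exact_output` and `rel_err` are made up for illustration. It assumes the first printed element is a corner pixel and the following ones lie on the top edge, which follows from `padding=1` and the flatten order.

```python
# Values copied from the outputs printed earlier in this thread.
pytorch_out = {"corner": 733.357605, "edge": 1099.435791}
torch7_out = {"corner": 733.360107, "edge": 1099.439697}

FLOAT32_EPS = 2.0 ** -23  # spacing of float32 values just above 1.0


def exact_output(taps, channels=64, weight=2.2, value=1.3, bias=1.2):
    """Reference value (in float64) when `taps` kernel positions hit real input."""
    return taps * channels * weight * value + bias


def rel_err(a, b):
    """Relative difference of a with respect to b."""
    return abs(a - b) / abs(b)


# With zero padding of 1, a corner pixel overlaps 2x2 = 4 kernel taps
# and a top-edge pixel overlaps 2x3 = 6 kernel taps.
exact = {"corner": exact_output(4), "edge": exact_output(6)}

for key in ("corner", "edge"):
    gap = rel_err(pytorch_out[key], torch7_out[key])
    print("%s: exact=%.6f" % (key, exact[key]))
    print("  pytorch vs torch7: rel. diff %.2e (%.0f x float32 eps)"
          % (gap, gap / FLOAT32_EPS))
    print("  pytorch vs exact:  rel. diff %.2e" % rel_err(pytorch_out[key], exact[key]))
    print("  torch7  vs exact:  rel. diff %.2e" % rel_err(torch7_out[key], exact[key]))
```

The reference values come out as 733.36 and 1099.44, and the two frameworks differ by a few parts per million, a few tens of float32 epsilons.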

Best regards

Thomas

Thanks for your reply. I’m trying to use a PyTorch pre-trained model in torch7 (Lua) and C (libTNN). It seems these small differences can cause huge errors in the output of the entire network (which contains a lot of convolution layers).

Yeah, unfortunate as it is, that suggests the model itself isn’t terribly robust, because errors of that size might happen even when switching backends within PyTorch.
You could try to train the PyTorch model for a few more steps so that its outputs more closely match the torch7 ones.
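To illustrate why two backends can disagree at this scale, here is a small sketch that emulates float32 arithmetic in plain Python and sums the same products in two different orders. It is not how either framework actually implements convolution; `f32`, `sequential_sum` and `pairwise_sum` are hypothetical names, and the constants are the ones from the corner pixel in the original post.

```python
import struct


def f32(x):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


WEIGHT, VALUE, BIAS = f32(2.2), f32(1.3), f32(1.2)
N_TERMS = 4 * 64  # corner pixel: 2x2 kernel taps over 64 input channels


def sequential_sum():
    """Start from the bias and add one product at a time, rounding each step."""
    product = f32(WEIGHT * VALUE)
    acc = BIAS
    for _ in range(N_TERMS):
        acc = f32(acc + product)
    return acc


def pairwise_sum():
    """Add the products in a balanced tree, then add the bias last."""
    terms = [f32(WEIGHT * VALUE)] * N_TERMS
    # N_TERMS is a power of two, so every level has an even number of terms.
    while len(terms) > 1:
        terms = [f32(terms[i] + terms[i + 1]) for i in range(0, len(terms), 2)]
    return f32(terms[0] + BIAS)


def float64_reference():
    """Same quantity computed in float64, for comparison."""
    return 4 * 64 * 2.2 * 1.3 + 1.2


seq, pair, ref = sequential_sum(), pairwise_sum(), float64_reference()
print("sequential float32: %.6f" % seq)
print("pairwise float32:   %.6f" % pair)
print("float64 reference:  %.6f" % ref)
```

Both float32 results are legitimate answers to the same convolution. Which one a backend produces depends on how it orders and groups the additions, so the last few digits are not something a ported model should rely on.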