Convolution outputs a different result

Here’s my code.

import torch
import torch.nn as nn


x = torch.rand(3, 3, 5).float()
weight = torch.rand(6, 3, 1).float()
bias = torch.rand(6).float()

net = nn.Conv1d(3, 6, 1, bias=True)  # convolution with its built-in bias
net.weight = nn.Parameter(weight.clone(), requires_grad=True)
net.bias = nn.Parameter(bias.clone(), requires_grad=True)
y_src = net(x.clone())

net_no_bias = nn.Conv1d(3, 6, 1, bias=False)  # same convolution without bias, bias added manually
net_no_bias.weight = nn.Parameter(weight.clone(), requires_grad=True)
y_dst = net_no_bias(x) + bias.view(1, -1, 1)

print((y_src - y_dst).abs().max())  # tensor(2.3842e-07, grad_fn=<MaxBackward1>)

Why? I ran this code on the CPU and expected to get 0. What's the difference between them?

BTW, if the values in weight and bias are large, the deviation can grow to about 0.5.

These errors are most likely due to the limited floating point precision:

torch.manual_seed(2809)

x = torch.randn(100, 100, 100)
y = torch.randn(100, 100, 100)

x_sum1 = x.sum(0).sum(0).sum(0)
x_sum2 = x.sum(2).sum(1).sum(0)
y_sum1 = y.sum(0).sum(0).sum(0)
y_sum2 = y.sum(2).sum(1).sum(0)

res1 = x_sum1 * y_sum1
res2 = x_sum2 * y_sum2 

print(x_sum1-x_sum2)
> tensor(-0.0002)
print(y_sum1-y_sum2)
> tensor(-0.0006)
print(res1-res2)
> tensor(0.6250)

If you need more precision, you could use double precision via .double().
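For example, rerunning your comparison in float64 should bring the difference down to (practically) zero; a minimal sketch, reusing x, weight, and bias from your snippet:

x64 = x.double()
weight64 = weight.double()
bias64 = bias.double()

net64 = nn.Conv1d(3, 6, 1, bias=True).double()  # conv with built-in bias, in float64
net64.weight = nn.Parameter(weight64.clone())
net64.bias = nn.Parameter(bias64.clone())
y_src64 = net64(x64)

net64_no_bias = nn.Conv1d(3, 6, 1, bias=False).double()  # bias-free conv + manual bias add
net64_no_bias.weight = nn.Parameter(weight64.clone())
y_dst64 = net64_no_bias(x64) + bias64.view(1, -1, 1)

print((y_src64 - y_dst64).abs().max())  # should be ~0, often exactly 0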

Is there any other way to make y_src - y_dst equal to 0? I want to replace each convolution in my network with convolution_without_bias + bias_add (a rough sketch of what I mean is below). As the network gets deeper, this deviation is amplified.

Converting all data to double would work, but it wastes too much memory. :cry:
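Something like this helper is what I mean by the replacement (just a rough sketch, split_conv_bias is a placeholder name):

def split_conv_bias(conv):
    # copy the existing Conv1d, but without its bias term
    conv_nb = nn.Conv1d(conv.in_channels, conv.out_channels, conv.kernel_size[0],
                        stride=conv.stride, padding=conv.padding, bias=False)
    conv_nb.weight = nn.Parameter(conv.weight.detach().clone())
    bias = conv.bias.detach().clone()
    def forward(x):
        # bias-free convolution followed by an explicit bias add
        return conv_nb(x) + bias.view(1, -1, 1)
    return forward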

I don’t think you can force the order of operations to be constant; at least I’m not aware of any way to do so.
How large are your activation values when you see the large error?
I assume you would like to get rid of the 0.5 absolute difference.
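If you are not sure, a quick forward-hook sketch like this could print the max absolute activation of each conv layer (model here stands for your network):

def log_conv_activations(model):
    handles = []
    for name, module in model.named_modules():
        if isinstance(module, nn.Conv1d):
            # print the largest absolute activation produced by this conv
            def hook(mod, inp, out, name=name):
                print(name, out.abs().max().item())
            handles.append(module.register_forward_hook(hook))
    return handles  # remove the hooks via handle.remove() when done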

Yes, I want to get rid of the 0.5 absolute difference.
I will try to use this result for the subsequent operations; if the end result is acceptable, I think this phenomenon can be ignored.
God bless me.