Convolution outputs a different result

Here’s my code.

import torch
import torch.nn as nn


x = torch.rand(3, 3, 5).float()
weight = torch.rand(6, 3, 1).float()
bias = torch.rand(6).float()

net = nn.Conv1d(3, 6, 1, bias=True)  # convolution with its built-in bias
net.weight = nn.Parameter(weight.clone(), requires_grad=True)
net.bias = nn.Parameter(bias.clone(), requires_grad=True)
y_src = net(x.clone())

net_no_bias = nn.Conv1d(3, 6, 1, bias=False)  # same convolution without bias, bias added manually
net_no_bias.weight = nn.Parameter(weight.clone(), requires_grad=True)
y_dst = net_no_bias(x) + bias.view(1, -1, 1)

print((y_src - y_dst).abs().max())  # tensor(2.3842e-07, grad_fn=<MaxBackward1>)

Why? I ran this code on the CPU and expected to get 0. What's the difference between them?

BTW, if the values in weight and bias are large, the deviation can grow to about 0.5.

These errors are most likely due to the limited floating point precision:

torch.manual_seed(2809)

x = torch.randn(100, 100, 100)
y = torch.randn(100, 100, 100)

x_sum1 = x.sum(0).sum(0).sum(0)
x_sum2 = x.sum(2).sum(1).sum(0)
y_sum1 = y.sum(0).sum(0).sum(0)
y_sum2 = y.sum(2).sum(1).sum(0)

res1 = x_sum1 * y_sum1
res2 = x_sum2 * y_sum2 

print(x_sum1-x_sum2)
> tensor(-0.0002)
print(y_sum1-y_sum2)
> tensor(-0.0006)
print(res1-res2)
> tensor(0.6250)

If you need more precision, you could use double precision via .double().
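For example, rerunning your comparison in float64 should bring the difference down to (practically) zero; a minimal sketch, reusing x, weight, and bias from your snippet:

x64 = x.double()
weight64 = weight.double()
bias64 = bias.double()

net64 = nn.Conv1d(3, 6, 1, bias=True).double()  # conv with built-in bias, in float64
net64.weight = nn.Parameter(weight64.clone())
net64.bias = nn.Parameter(bias64.clone())
y_src64 = net64(x64)

net64_no_bias = nn.Conv1d(3, 6, 1, bias=False).double()  # bias-free conv + manual bias add
net64_no_bias.weight = nn.Parameter(weight64.clone())
y_dst64 = net64_no_bias(x64) + bias64.view(1, -1, 1)

print((y_src64 - y_dst64).abs().max())  # should be ~0, often exactly 0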

Is there any other way to make y_src - y_dst equal to 0? I want to replace each convolution in my network with convolution_without_bias + bias_add (a rough sketch of what I mean is below). As the network gets deeper, this deviation is amplified.

Converting all data to double would work, but it wastes too much memory. :cry:
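Something like this helper is what I mean by the replacement (just a rough sketch, split_conv_bias is a placeholder name):

def split_conv_bias(conv):
    # copy the existing Conv1d, but without its bias term
    conv_nb = nn.Conv1d(conv.in_channels, conv.out_channels, conv.kernel_size[0],
                        stride=conv.stride, padding=conv.padding, bias=False)
    conv_nb.weight = nn.Parameter(conv.weight.detach().clone())
    bias = conv.bias.detach().clone()
    def forward(x):
        # bias-free convolution followed by an explicit bias add
        return conv_nb(x) + bias.view(1, -1, 1)
    return forward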

I don’t think you can force the order of operations to be constant; at least I’m not aware of any way to do so.
How large are your activation values when you see the large error?
I assume you would like to get rid of the 0.5 absolute difference.
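If you are not sure, a quick forward-hook sketch like this could print the max absolute activation of each conv layer (model here stands for your network):

def log_conv_activations(model):
    handles = []
    for name, module in model.named_modules():
        if isinstance(module, nn.Conv1d):
            # print the largest absolute activation produced by this conv
            def hook(mod, inp, out, name=name):
                print(name, out.abs().max().item())
            handles.append(module.register_forward_hook(hook))
    return handles  # remove the hooks via handle.remove() when done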

Yes, I want to get rid of the 0.5 absolute difference.
I will try to use this result for the subsequent operations; if the end result is acceptable, I think this phenomenon can be ignored.
God bless me.