I have noticed a disturbing pattern with PyTorch (and other libraries too) but can't get to the bottom of it!
Any operation that involves a bias term gives slightly different (seemingly wrong) results. I will illustrate with the Conv2d module here.
Without bias term:
import torch
import numpy as np
torch.set_printoptions(precision=32)
np.random.seed(23)
# Create random data points with batch size = 5
x_data = np.random.random(size=(5,1,1,1)).astype('float32')
x = torch.as_tensor(x_data)
# Defining Network
net = torch.nn.Conv2d(in_channels=1,out_channels=1,kernel_size=1)
# Initializing random weights and zero biases
weight = np.random.random(size=(1,1,1,1)).astype('float32')
bias = np.zeros(shape=(1,)).astype('float32')
from collections import OrderedDict
parameters = OrderedDict()
parameters['weight'] = torch.as_tensor(weight)
parameters['bias'] = torch.as_tensor(bias)
# Loading the network parameters with custom W & B
net.load_state_dict(parameters)
output1 = net(x)
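For reference, a 1x1 convolution with a single input and output channel just multiplies each input value by the scalar weight, so the no-bias output can be verified by hand. A minimal sanity check (expected1 is just a name I introduce here; it reuses x_data, weight and output1 from above):

# Sanity check (not part of the issue itself): a 1x1 single-channel Conv2d with zero bias
# is an elementwise scale by the weight, so computing it directly should match output1
expected1 = torch.as_tensor(x_data * weight)   # broadcasts (5,1,1,1) * (1,1,1,1)
print(torch.allclose(output1, expected1))      # should print True, since the bias is zero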
With bias term:
import torch
import numpy as np
torch.set_printoptions(precision=32)
np.random.seed(23)
# You can reinitialize x here or reuse the same x from before; it makes no difference
x_data = np.random.random(size=(5,1,1,1)).astype('float32')
x = torch.as_tensor(x_data)
net = torch.nn.Conv2d(in_channels=1,out_channels=1,kernel_size=1)
# The weight can be the same as before; the bias is now randomly initialized
weight = np.random.random(size=(1,1,1,1)).astype('float32')
bias = np.random.random(size=(1,)).astype('float32')
from collections import OrderedDict
parameters = OrderedDict()
parameters['weight'] = torch.as_tensor(weight)
parameters['bias'] = torch.as_tensor(bias)
net.load_state_dict(parameters)
output2 = net(x)
# You can even print the actual values and notice the difference!
print(output2-output1 == torch.as_tensor(bias))
My outputs are:
tensor([[[[ True]]],
        [[[False]]],
        [[[False]]],
        [[[ True]]],
        [[[False]]]])
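To get a sense of how large the mismatch actually is, the deviation of output2 - output1 from the bias can be printed directly (a small extra check on top of the snippets above; the exact number will vary with the seed, but it should be tiny, on the order of float32 machine epsilon, i.e. roughly 1e-7):

# Magnitude of the mismatch between (output2 - output1) and the bias
diff = (output2 - output1) - torch.as_tensor(bias)
print(diff.abs().max())   # expected to be around 1e-7 or smaller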
This True/False pattern looks completely random: choose a different seed or batch_size and you will get a different sequence.
Although the error introduced by the bias term is very small, it is definitely there, and not knowing the reason is frustrating. If you know the reason (and/or a workaround to avoid it), please help! Thanks!
Note: I have observed the same randomness in other modules such as BatchNorm and in other libraries such as PaddlePaddle. Also, I am running this on the CPU.
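For completeness: if the exact == comparison is replaced with a tolerance-based one, the two runs do agree (a minimal sketch with torch.allclose, reusing the tensors from the snippets above), but I would still like to understand where the exact mismatch comes from.

# Tolerance-based comparison instead of exact equality
print(torch.allclose(output2 - output1, torch.as_tensor(bias)))   # should be True for differences this small
print(torch.allclose(output2, output1 + torch.as_tensor(bias)))   # should be True as well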