1x1 Conv2d functionality in the Downsample of ResNet18 is different than in other frameworks

In the PyTorch ResNet class, the resnet18 architecture uses BasicBlock, and in that BasicBlock there is a sequential module called downsample.
In downsample, PyTorch performs a 1x1 Conv2d operation.

My question is: why am I getting a different output for the 1x1 convolution in PyTorch compared to other frameworks like Darknet for the same operation?
Note: the other 2x2 and 3x3 convolutions give the same output in the other framework; only the 1x1 convolution gives a very different output.

Is there a different approach that PyTorch uses for the special case of 1x1 convolutions?
Code:
downsample = nn.Sequential(
    conv1x1(self.inplanes, planes * block.expansion, stride),
    norm_layer(planes * block.expansion),
)
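
For reference, the conv1x1 helper used here is defined in the torchvision source as a plain nn.Conv2d with kernel_size=1 and no bias, so on the Python side there is no special 1x1 code path:

def conv1x1(in_planes, out_planes, stride=1):
    """1x1 convolution"""
    return nn.Conv2d(in_planes, out_planes, kernel_size=1, stride=stride, bias=False)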

Usage:
def forward(self, x):
    identity = x  # keep the input for the residual (skip) connection

    out = self.conv1(x)
    out = self.bn1(out)
    out = self.relu(out)

    out = self.conv2(out)
    out = self.bn2(out)
    out = self.relu(out)

    out = self.conv3(out)
    out = self.bn3(out)

    if self.downsample is not None:
        # match the identity's shape to `out` via the 1x1 conv + norm
        identity = self.downsample(x)

    out += identity
    out = self.relu(out)
    return out
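
To see concretely what downsample does to the identity tensor, here is a minimal sketch (the channel counts and spatial sizes are illustrative assumptions, not values taken from the model above):

import torch
import torch.nn as nn

# assumed sizes for illustration: 64 -> 128 channels, stride 2
x = torch.randn(1, 64, 56, 56)
downsample = nn.Sequential(
    nn.Conv2d(64, 128, kernel_size=1, stride=2, bias=False),
    nn.BatchNorm2d(128),
)
print(downsample(x).shape)  # torch.Size([1, 128, 28, 28])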

You could create a dummy example with small sizes and compute the result manually:

import torch
import torch.nn as nn

x = torch.randn(1, 2, 2, 2)
conv = nn.Conv2d(2, 1, 1)
output = conv(x)

print(x)
> tensor([[[[-0.4150, -0.3450],
          [-0.2987,  1.7142]],

         [[ 1.2817, -0.7830],
          [-0.0724, -1.8714]]]])
print(conv.weight)
> tensor([[[[ 0.6824]],

         [[-0.5454]]]], requires_grad=True)
print(conv.bias)
> tensor([-0.5595], requires_grad=True)
print(output)
> tensor([[[[-1.5418, -0.3679],
          [-0.7238,  1.6310]]]], grad_fn=<MkldnnConvolutionBackward>)

# manual calculation for the first output value:
# weighted sum over the two input channels plus the bias
print((-0.415 * 0.6824) + (1.2817 * -0.5454) + (-0.5595))
> -1.5417351799999999
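
Since a 1x1 convolution is just a per-pixel weighted sum over the input channels plus the bias, the whole output can also be recomputed with broadcasting as a sanity check (a sketch reusing x, conv, and output from above; the broadcasted multiply relies on there being a single output channel, as in this example):

# recompute the 1x1 conv by hand: multiply each input channel by its
# weight, sum over channels, then add the bias
manual = (x * conv.weight).sum(dim=1, keepdim=True) + conv.bias.view(1, -1, 1, 1)
print(torch.allclose(manual, output))  # True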

Try to create this small example in the other framework and check how the result is calculated.
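
If the random initialization makes the cross-framework comparison awkward, you could also fix the weights to known values so both frameworks can be checked against the same expected numbers (a sketch; setting up the matching Darknet layer is left to you):

import torch
import torch.nn as nn

conv = nn.Conv2d(2, 1, kernel_size=1)
with torch.no_grad():
    conv.weight.fill_(0.5)  # every weight = 0.5
    conv.bias.fill_(0.1)

x = torch.ones(1, 2, 2, 2)
print(conv(x))  # each output value should be 1.0*0.5 + 1.0*0.5 + 0.1 = 1.1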
