Conv2d behaviour question

I’m trying to understand the behaviour of the Conv2d layer in PyTorch, but it gives different results in scenarios where I would expect identical results.

Maybe I’m missing something?

Example (link to Jupyter notebook):

import torch
torch.set_printoptions(precision=9)
torch.manual_seed(1)
# 3 channels, 3 rows, 4 columns, filled with N(0, 1); [None] adds a batch dimension
data = torch.autograd.Variable(torch.FloatTensor(3, 3, 4).normal_(0, 1))[None]

Data is:

tensor([[[[-1.525595903, -0.750231802, -0.653980911, -1.609484792],
          [-0.100167178, -0.609188914, -0.979772270, -1.609096289],
          [-0.712144613, 0.303721994, -0.777314305, -0.251455247]],

         [[-0.222270489, 1.687113404, 0.228425175, 0.467635512],
          [-0.696972430, -1.160761476, 0.699542403, 0.199081630],
          [0.199056506, 0.045702778, 0.152956918, -0.475678802]],

         [[-1.882142544, -0.776545048, 2.024202108, -0.086541198],
          [2.357110977, -1.037338734, 1.574798107, -0.629847229],
          [2.406978130, 0.278566241, 0.246752918, 1.184326649]]]])

Manual execution of the convolutions
The first conv2d window covers the top-left 3x3 pixels:

torch.sum(data[:,:,0:3,0:3] * 0.04)

Output:

tensor(1.00000e-02 *
       1.282003708)

The second window is shifted one column to the right:

torch.sum(data[:,:,0:3,1:4] * 0.04)

Output:

tensor(1.00000e-02 *
       -9.257644415)

Using the Conv2d layer

layer = torch.nn.Conv2d(3, 1, kernel_size=3)
layer.weight.data.fill_(0.04)  # constant weights, matching the manual sums above
layer.bias.data.fill_(0)
layer(data)

Gives:

tensor(1.00000e-02 *
       [[[[1.282003429, -9.257646650]]]])
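
Comparing the layer output with the manual sums directly makes the mismatch visible (a quick check, reusing the tensors defined above):

manual = torch.stack([
    torch.sum(data[:, :, 0:3, 0:3] * 0.04),
    torch.sum(data[:, :, 0:3, 1:4] * 0.04),
])
# The residuals are tiny (on the order of 1e-8) but not exactly zero
print(layer(data).flatten() - manual)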

Strangely, neither value exactly matches the manual calculation. In PyTorch 0.3.1 the Conv2d result was the same as the manual calculation.

Another strange behaviour with multiple filters:

layer = torch.nn.Conv2d(3, 2, kernel_size=3)
layer.weight.data.fill_(0.04)
layer.bias.data.fill_(0)
layer(data)

Gives:

tensor(1.00000e-02 *
       [[[[1.282002497, -9.257643670]],
         [[1.282002497, -9.257643670]]]])

This result is again slightly different from the single-filter run (though, as expected, both filters produce identical values).

Another behaviour I do not understand
Applying the layer to just the first 3x3 patch of the data:

layer(data[:,:,0:3,0:3])

Gives:

tensor(1.00000e-02 *
       [[[[1.282002777]],
         [[1.282002777]]]])

If I understand it correctly, this should be the same as the first column of the previous result.
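
Comparing the two directly (with the two-filter layer from above) shows they differ only in the last digits:

# Residual between the full-input output's first column and the patch output
print(layer(data)[:, :, :, 0:1] - layer(data[:, :, 0:3, 0:3]))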

Thanks for helping!
Hans

The observed differences are most likely caused by different implementations or a different order of operations.
Since the differences are at approx. 1e-6, that’s the likely reason in my opinion.
Have a look at the following code:

a = torch.randn(10, 10, 10)
b = a.sum()                  # one global reduction
c = a.sum(2).sum(1).sum(0)   # same values, summed in a different order
print(b)
print(c)
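
For example, you could also repeat the convolution in double precision; the rounding differences should then shrink far below the printed precision (a quick sketch reusing the data from your question):

layer64 = torch.nn.Conv2d(3, 1, kernel_size=3).double()
layer64.weight.data.fill_(0.04)
layer64.bias.data.fill_(0)
# In float64 the layer output and the manual window sum agree
# to (almost) all printed digits
print(layer64(data.double()))
print(torch.sum(data.double()[:, :, 0:3, 0:3] * 0.04))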

Thanks! That makes sense.