Padding for convolutions

Bixqu · August 7, 2017, 2:46pm

While testing a very deep convolutional network, I noticed that there is no padding ='SAME' option, like tensorflow has. What I did was to set the padding inside the convolutional layer, like so

    self.conv3 = nn.Conv2d(in_channels=10, out_channels=10, kernel_size=3, stride=1, padding=(1,1))

This works in terms of preserving dimensionality, but what I am worried by is that it applies padding after the convolution, so that the last layers actually perform convolutions over an array of zeros. My network is also not training.

The dataset I am using is CIFAR-10 , so, without proper padding before the convolution, the height and width of the image goes to zero very fast (after 3-4 layers).

How can I get around that?

EDIT: If I print out the first example in a batch, of shape [20, 16, 16], where 20 is the number of channels from the previous convolution, it looks like this:

(1 ,.,.) = 
1.00000e-02 *
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
           ...             ⋱             ...          
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000

(2 ,.,.) = 
1.00000e-02 *
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
           ...             ⋱             ...          
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
...

(17,.,.) = 
1.00000e-02 *
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
           ...             ⋱             ...          
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000

(18,.,.) = 
1.00000e-02 *
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
           ...             ⋱             ...          
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000

(19,.,.) = 
1.00000e-02 *
  6.4769  7.6986  7.5997  ...   7.5947  7.5006  6.4277
  6.7377  6.7590  6.2768  ...   6.3319  6.1432  5.2836
  6.6169  6.4841  6.1549  ...   6.1608  6.0279  5.2591
           ...             ⋱             ...          
  6.5688  6.4459  6.0924  ...   6.1113  6.0179  5.2809
  6.4056  5.7569  5.3210  ...   5.3467  5.2885  4.7401
  5.2931  5.1357  4.9795  ...   4.9808  4.8801  4.4145
[torch.cuda.FloatTensor of size 20x16x16 (GPU 0)]

Basically everything is zero, except for one channel. Any idea why this is?

fmassa · August 7, 2017, 3:30pm

The padding option appends zeros before the convolution (in the input), pretty much like SAME option in TF.
If you want, you can also use F.pad with reflect or replicate mode, with you don’t want to pad the input with zeros.

Bixqu · August 7, 2017, 3:31pm

Thanks for the answer, I updated my original post above.

fmassa · August 7, 2017, 3:48pm

There are plenty of reasons why it could be failing.

Are you using ReLU? If your bias in the convolution are very negative, you could be in the 0 zone of the ReLU
Are you using Batch Norm? It could help with ReLU, or maybe you should check the initialization of your weights?

If you think that the padding is the problem, try padding with F.pad before the convolution, and remove the padding from there, but I doubt it is the cause.

Bixqu · August 8, 2017, 8:24am

But I still do not understand why all feature maps are zero except for the last one.