While testing a very deep convolutional network, I noticed that there is no padding='SAME' option like TensorFlow has. What I did was to set the padding inside the convolutional layer, like so:
self.conv3 = nn.Conv2d(in_channels=10, out_channels=10, kernel_size=3, stride=1, padding=(1,1))
This works in terms of preserving dimensionality, but what I am worried about is that it applies the padding after the convolution, so that the last layers actually perform convolutions over an array of zeros. My network is also not training.
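As a sanity check, here is a minimal snippet (the 16x16 input is just a dummy example) confirming that the layer at least preserves the spatial dimensions:

import torch
import torch.nn as nn

conv3 = nn.Conv2d(in_channels=10, out_channels=10, kernel_size=3, stride=1, padding=(1, 1))
x = torch.randn(1, 10, 16, 16)  # dummy batch: 1 example, 10 channels, 16x16
print(conv3(x).size())  # torch.Size([1, 10, 16, 16]) -- spatial size preserved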
The dataset I am using is CIFAR-10, so, without proper padding before the convolution, the height and width of the image shrink to zero very fast (after 3-4 layers).
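For reference, the output size of a convolution follows floor((in + 2*padding - kernel_size) / stride) + 1; here is a quick sketch of how the shrinkage compounds, assuming 3x3 kernels with stride 1 just for illustration:

def conv_out(size, kernel_size=3, stride=1, padding=0):
    # standard convolution output-size formula
    return (size + 2 * padding - kernel_size) // stride + 1

size = 32  # CIFAR-10 images are 32x32
for layer in range(4):
    size = conv_out(size)   # each unpadded 3x3 conv trims a 1-pixel border
    print(layer + 1, size)  # 30, 28, 26, 24 -- with padding=1 it would stay 32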
How can I get around that?
EDIT: If I print out the first example in a batch, of shape [20, 16, 16] (where 20 is the number of channels from the previous convolution), it looks like this:
(1 ,.,.) =
1.00000e-02 *
0.0000 0.0000 0.0000 ... 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 ... 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 ... 0.0000 0.0000 0.0000
... ⋱ ...
0.0000 0.0000 0.0000 ... 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 ... 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 ... 0.0000 0.0000 0.0000
...
(channels 2 through 18 omitted -- they are all zeros as well)
...
(19,.,.) =
1.00000e-02 *
6.4769 7.6986 7.5997 ... 7.5947 7.5006 6.4277
6.7377 6.7590 6.2768 ... 6.3319 6.1432 5.2836
6.6169 6.4841 6.1549 ... 6.1608 6.0279 5.2591
... ⋱ ...
6.5688 6.4459 6.0924 ... 6.1113 6.0179 5.2809
6.4056 5.7569 5.3210 ... 5.3467 5.2885 4.7401
5.2931 5.1357 4.9795 ... 4.9808 4.8801 4.4145
[torch.cuda.FloatTensor of size 20x16x16 (GPU 0)]
Basically everything is zero, except for one channel. Any idea why this is?
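For what it's worth, this is roughly how I confirmed that only one channel is alive (out here is just a placeholder name for the intermediate activation printed above):

act = out[0]  # first example in the batch, shape [20, 16, 16]
per_channel = act.view(act.size(0), -1).abs().sum(dim=1)
print(per_channel)  # only one of the 20 entries is non-zero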