Unexpected number of convolution filters

Sohrab_Salimian · August 22, 2018, 3:51pm

I have a fairly elementary question, but it has caused me some trouble. Let's say we have a two-layer convolutional network. In the first layer we have

Conv1 = Conv2d(1, 2, kernel_size=3, stride=1)
meaning that we have two filters for our input, producing two feature maps.

In the second layer we have
Conv2 = Conv2d(2, 2, kernel_size=3, stride=1)
In this layer I would expect two filters, since the final output is two feature maps, but when I look at the weights there are 4 convolutional filters in the second convolutional layer. Why is this?

Sohrab_Salimian · August 22, 2018, 4:59pm

Any ideas, from anyone?

ptrblck · August 22, 2018, 5:42pm

Your assumption is right! Both of your layers have two filters; the filters just differ in their number of channels.

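import torch.nn as nn

# Positional arguments: in_channels, out_channels, kernel_size, stride, padding
conv1 = nn.Conv2d(1, 2, 3, 1, 1)
print(conv1.weight.shape)
# torch.Size([2, 1, 3, 3])
conv2 = nn.Conv2d(2, 2, 3, 1, 1)
print(conv2.weight.shape)
# torch.Size([2, 2, 3, 3])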

The weight shape is defined as [nb_filters, in_channels, h, w].
So although the number of input channels changes between the layers, each layer still has two filters.
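As a small sketch of where the count of 4 can come from: multiplying the first two dimensions of the weight shape gives the number of 2D [3, 3] slices, which is not the same thing as the number of filters. The variable name num_2d_slices is only illustrative.

```python
import torch.nn as nn

# Same layer as conv2 above: 2 input channels, 2 output channels, 3x3 kernel.
conv2 = nn.Conv2d(in_channels=2, out_channels=2, kernel_size=3, stride=1, padding=1)

nb_filters, in_channels, h, w = conv2.weight.shape

# Each filter spans all input channels, so it contains in_channels 2D slices.
num_2d_slices = nb_filters * in_channels

print(nb_filters)        # 2 filters
print(num_2d_slices)     # 4 slices of size 3x3
print(conv2.bias.shape)  # one bias per filter
```

So the layer stores four [3, 3] slices, but they are grouped into two filters of depth 2, and there is one bias value per filter.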

Sohrab_Salimian · August 22, 2018, 5:45pm

I see, but I still don't understand why there are 4 separate filters in layer 2. It's almost like there are 2 filters per incoming channel, when I only wanted 2 filters total… I'm sorry if this is a very simple question, I'm just not understanding why there are 4 filters in the second convolutional layer.

Sohrab_Salimian · August 22, 2018, 5:55pm

So we increase the number of channels from 1 to 2 going from convolution 1 to 2. We thus increase our filter number from 2 to 4, but the output channels leaving convolution 2 are still 2. So are we applying 2 separate sets of filters to each channel coming into convolution 2?

ptrblck · August 22, 2018, 6:00pm

No, we still have two filters in each layer. Each filter computes a dot product with the input activation across all input channels.
Have a look at the AlexNet architecture in Figure 2. You can see that each filter has a depth that extends through the input volume.
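One way to check this numerically is sketched below, assuming a random 8x8 input with 2 channels (the input size and seed are arbitrary choices). For each filter, the per-channel 2D convolutions are summed and the bias is added, and the result is compared with the layer's own output.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
conv2 = nn.Conv2d(2, 2, 3, 1, 1)
x = torch.randn(1, 2, 8, 8)  # hypothetical input: batch of 1, 2 channels, 8x8

with torch.no_grad():
    reference = conv2(x)

    per_filter = []
    for f in range(conv2.out_channels):
        acc = torch.zeros(1, 1, 8, 8)
        for c in range(conv2.in_channels):
            # One [3, 3] slice of filter f, applied to input channel c only.
            kernel = conv2.weight[f, c].view(1, 1, 3, 3)
            acc += F.conv2d(x[:, c:c + 1], kernel, padding=1)
        # One feature map per filter: sum over input channels plus bias.
        per_filter.append(acc + conv2.bias[f])

    manual = torch.cat(per_filter, dim=1)

print(manual.shape)
print(torch.allclose(reference, manual, atol=1e-5))
```

The two results should agree up to floating point tolerance, which illustrates that the four slices are combined into two output feature maps rather than producing four.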

Also, have a look at the Convolution lecture of CS231n.
Some information:

The connections are local in space (along width and height), but always full along the entire depth of the input volume. For example, suppose that the input volume has size [32x32x3], (e.g. an RGB CIFAR-10 image). If the receptive field (or the filter size) is 5x5, then each neuron in the Conv Layer will have weights to a [5x5x3] region in the input volume, for a total of 5*5*3 = 75 weights (and +1 bias parameter). Notice that the extent of the connectivity along the depth axis must be 3, since this is the depth of the input volume.
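The arithmetic in the quote can be reproduced with a single-filter layer. This is only a sketch, and the choice of one output channel is an assumption made to isolate a single filter.

```python
import torch.nn as nn

# One 5x5 filter over a 3-channel (RGB) input.
conv = nn.Conv2d(in_channels=3, out_channels=1, kernel_size=5)

weights_per_filter = conv.weight[0].numel()  # 3 * 5 * 5
total_params = sum(p.numel() for p in conv.parameters())  # weights plus bias

print(weights_per_filter)  # 75
print(total_params)        # 76
```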

Sohrab_Salimian · August 22, 2018, 6:03pm

So then in the second convolution layer my two filters have the shape 2x3x3, so their depth is now 2, whereas in the first layer it was 1? Thank you so much for your help! In that case, how would I visualize these depth-2 filters?

ptrblck · August 22, 2018, 6:06pm

Exactly!
Well, you could slice the channels and visualize each one as a gray image.
If you use color images (3 channels), the filters of your first conv layer will also have 3 channels, thus you could visualize them in color.

Sohrab_Salimian · August 22, 2018, 6:08pm

I see! Thank you very much! By slice, do you mean take the 2x2x3x3 weights and visualize them as two separate 2x3x3 images?

ptrblck · August 22, 2018, 6:18pm

I mean visualizing each slice of the two filters as a [3, 3] image.
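A minimal sketch of that slicing is shown below; the dictionary name slices is just illustrative. Each entry is a [3, 3] tensor that could then be passed to an image plotting function of your choice (for example matplotlib's imshow with a gray colormap, which is an assumption and not part of this snippet).

```python
import torch.nn as nn

conv2 = nn.Conv2d(2, 2, 3, 1, 1)
weights = conv2.weight.detach()  # shape [2, 2, 3, 3], detached from autograd

# Collect one [3, 3] slice per (filter, input channel) pair.
slices = {}
for f in range(weights.shape[0]):
    for c in range(weights.shape[1]):
        slices[(f, c)] = weights[f, c]

for (f, c), img in slices.items():
    print(f"filter {f}, channel {c}: shape {tuple(img.shape)}")
```

This yields four gray images in total, two per filter.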

Sohrab_Salimian · August 22, 2018, 6:19pm

I see, thank you very much! I really appreciate the help.