How many filters in Conv2d?

111353 · June 25, 2020, 12:11pm

I can't work out how many filters are used in Conv2d.
This is my code; please see the picture.
At first I thought fig1 was correct, but when I looked at the code, fig2 seems to be correct.
Can someone give me a reference on this matter?
Thank you for reading to the end.

import torch.nn as nn
import torch

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(in_channels = 2, out_channels = 2, kernel_size = 3)

    def forward(self, x):
        x = self.conv1(x)
        return x
net = Net()
print(net)

params = list(net.parameters())
print(params[0])
print(params[0].size())


======================================
Output on my Colab. I can see 4 (3x3) tensors, which I thought were the filters:

Net(
  (conv1): Conv2d(2, 2, kernel_size=(3, 3), stride=(1, 1))
)
Parameter containing:
tensor([[[[-0.1645, -0.2205,  0.0995],
          [ 0.2017, -0.1659,  0.0161],
          [ 0.0099,  0.1740,  0.0792]],

         [[-0.1067,  0.1234,  0.0129],
          [-0.1366, -0.0107,  0.0756],
          [ 0.1778, -0.1056,  0.2191]]],


        [[[-0.0351,  0.0904, -0.1394],
          [-0.1006,  0.2080,  0.1312],
          [-0.1741, -0.0246, -0.0775]],

         [[-0.0482, -0.0906, -0.1982],
          [ 0.2164,  0.0711, -0.0212],
          [-0.0277, -0.0861, -0.1908]]]], requires_grad=True)

PresidentDoggo · June 25, 2020, 2:11pm

I think figure 2 is trying to show how the convolution operation happens inside the convolution layer.
I made a diagram based on my understanding of convnets to try to help.

There are two filters in the layer because out_channels = 2.
in_channels = 2 and kernel_size = 3, so each filter has size [3 x 3 x 2].

My diagram shows two [3 x 3 x 2] filters performing the convolution operation on the same input image. You see 4 tensors in the printed output because there are 4 [3 x 3] kernels in total (2 filters x 2 input channels).
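As a rough check of the counting above, here is a minimal sketch that inspects the layer's parameters directly. The variable names (`conv`, `num_filters`, `filter_shape`, `num_2d_kernels`) are just illustrative. Note that PyTorch stores the weight as (out_channels, in_channels, kernel_height, kernel_width), so each [3 x 3 x 2] filter shows up as a slice of shape (2, 3, 3).

```python
import torch
import torch.nn as nn

# Same layer as in the question: 2 input channels, 2 output channels, 3x3 kernels.
conv = nn.Conv2d(in_channels=2, out_channels=2, kernel_size=3)

# Weight layout: (out_channels, in_channels, kernel_height, kernel_width)
print(conv.weight.shape)  # torch.Size([2, 2, 3, 3])
# One bias value per filter / output channel
print(conv.bias.shape)    # torch.Size([2])

# Number of filters = out_channels (first dimension of the weight)
num_filters = conv.weight.shape[0]
# Each filter spans all input channels: (in_channels, 3, 3)
filter_shape = tuple(conv.weight[0].shape)
# Number of individual 3x3 kernels = out_channels * in_channels
num_2d_kernels = conv.weight.shape[0] * conv.weight.shape[1]

print(num_filters, filter_shape, num_2d_kernels)  # 2 (2, 3, 3) 4
```

So the four 3x3 tensors in the printed parameter are the four 2D kernels, grouped into two filters.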

EDIT:
The output of this operation is a feature map of size:

(𝑊−𝐹+2𝑃)/𝑆+1

𝑊 is the input size (width or height)
𝑃 is padding
𝐹 is filter size
𝑆 is stride

Hope this helps!

harsha_g · June 25, 2020, 2:25pm

You meant [output image height x output image width x out channels]. Didn’t you?

PresidentDoggo · June 25, 2020, 2:27pm

What I meant is that it has the same dimensions as the input image.

111353 · June 25, 2020, 2:28pm

Thank you so much, Mr PresidentDoggo.
I'm sorry, but may I ask one more thing?
I circled a part of your figure. Are the values in this part added up? Or is it the norm, or something else?

harsha_g · June 25, 2020, 2:31pm

That’s only in the case of padding.

111353 · June 25, 2020, 2:36pm

Mr PresidentDoggo,
Thank you very much for your kindness.
Your help was very useful to me.

I still don't understand how the two layers are merged into one.
Mr harsha_g, thank you for replying.
Could I ask for more detail?

PresidentDoggo · June 25, 2020, 2:54pm

@111353 you’re welcome

To answer your question about merging the two layers: the per-channel results are summed, and then the bias is added to the result. [source: Stanford CS convnets] (Also a great place to learn more!)
Although I have seen some literature use an average instead.
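To make the "sum, then add bias" step concrete, here is a small sketch that rebuilds the output of `nn.Conv2d` by hand. It assumes the default settings (stride 1, no padding, groups=1) and a made-up 5x5 random input; the names `reference`, `manual`, and `matches` are only for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
conv = nn.Conv2d(in_channels=2, out_channels=2, kernel_size=3)
x = torch.randn(1, 2, 5, 5)  # (batch, channels, height, width), arbitrary test input

with torch.no_grad():
    reference = conv(x)  # what the layer computes, shape (1, 2, 3, 3)

    manual = torch.zeros_like(reference)
    for o in range(conv.out_channels):
        for i in range(conv.in_channels):
            # Apply one 3x3 kernel to one input channel...
            kernel = conv.weight[o, i].reshape(1, 1, 3, 3)
            channel = x[:, i:i + 1]
            # ...and accumulate (sum) the result into output channel o.
            manual[:, o] += F.conv2d(channel, kernel)[:, 0]
        # After summing over input channels, add that filter's bias once.
        manual[:, o] += conv.bias[o]

matches = torch.allclose(reference, manual, atol=1e-5)
print(matches)
```

If `matches` prints `True`, the layer is summing the per-channel results rather than averaging them or taking a norm.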

@harsha_g is right, I made a slight mistake with the output!

The output of the conv operation is not, in general, the same size as the input image.
We can calculate the output size of the CONV operation along each spatial dimension using this equation:

(𝑊−𝐹+2𝑃)/𝑆+1

𝑊 is the input size (width or height)
𝑃 is padding
𝐹 is filter size
𝑆 is stride
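
As a quick sanity check of the formula, the sketch below compares it with the shapes PyTorch actually produces. The helper `conv_output_size` is a hypothetical function written for this example (not part of PyTorch), and the 5x5 input is an arbitrary choice. It uses floor division, which is how PyTorch rounds when the division is not exact.

```python
import torch
import torch.nn as nn

def conv_output_size(w, f, p=0, s=1):
    """Output size along one spatial dimension: (W - F + 2P) / S + 1, rounded down."""
    return (w - f + 2 * p) // s + 1

x = torch.randn(1, 2, 5, 5)  # W = 5 in both height and width

# No padding: the output shrinks from 5 to 3.
no_pad = nn.Conv2d(2, 2, kernel_size=3)
print(conv_output_size(5, 3))        # 3
print(no_pad(x).shape)               # torch.Size([1, 2, 3, 3])

# Padding of 1 with a 3x3 kernel and stride 1: the output keeps the input size.
same_pad = nn.Conv2d(2, 2, kernel_size=3, padding=1)
print(conv_output_size(5, 3, p=1))   # 5
print(same_pad(x).shape)             # torch.Size([1, 2, 5, 5])
```

This also shows the padding point raised earlier in the thread: the output matches the input size only when the padding compensates for the kernel size.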

harsha_g · June 25, 2020, 3:15pm

@111353 see if this video can help clarify your doubt.

111353 · June 25, 2020, 11:05pm

Mr @PresidentDoggo
Thank you very much for the source and your helpful information.
Mr @harsha_g
Thank you very much for the nice video and for joining this topic.