# How many filters in Conv2d?

I couldn’t understand how many filters are used in Conv2d.
This is my code; please also see the picture.
At first I thought fig1 was correct, but when I looked at the code, fig2 seems to be correct.
Can someone give me a reference on this matter?
Thank you for reading to the end.

import torch.nn as nn
import torch

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(in_channels = 2, out_channels = 2, kernel_size = 3)

    def forward(self, x):
        x = self.conv1(x)
        return x

net = Net()
print(net)

params = list(net.parameters())
print(params[0])
print(params[0].size())

======================================
Output on my Colab. I can see four (3*3) tensors, and I thought those were the filters:

Net(
(conv1): Conv2d(2, 2, kernel_size=(3, 3), stride=(1, 1))
)
Parameter containing:
tensor([[[[-0.1645, -0.2205,  0.0995],
[ 0.2017, -0.1659,  0.0161],
[ 0.0099,  0.1740,  0.0792]],

[[-0.1067,  0.1234,  0.0129],
[-0.1366, -0.0107,  0.0756],
[ 0.1778, -0.1056,  0.2191]]],

[[[-0.0351,  0.0904, -0.1394],
[-0.1006,  0.2080,  0.1312],
[-0.1741, -0.0246, -0.0775]],

[[-0.0482, -0.0906, -0.1982],
[ 0.2164,  0.0711, -0.0212],

I think figure 2 is trying to show how the convolution operation is happening inside the convolution layer.
I made a diagram based on my understanding of convnets to try and help.

There are two filters in the layer because out_channels = 2.
Since in_channels = 2 and kernel_size = 3, each filter has size [3 x 3 x 2].

My diagram shows two [3 x 3 x 2] filters performing the convolution operation on the same input image. You see four tensors in the printed output because there are four [3 x 3] kernels in total (2 filters x 2 input channels).
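To check the filter count from the code side, here is a minimal sketch that inspects the weight tensor of the same layer as in the question. The variable names (`num_filters`, `kernels_per_filter`, `num_kernels`) are just illustrative, not anything from PyTorch itself.

```python
import torch.nn as nn

conv = nn.Conv2d(in_channels=2, out_channels=2, kernel_size=3)

# PyTorch stores Conv2d weights as [out_channels, in_channels, kernel_h, kernel_w].
print(conv.weight.shape)  # torch.Size([2, 2, 3, 3])
print(conv.bias.shape)    # torch.Size([2]), one bias per filter

num_filters = conv.weight.shape[0]         # one filter per output channel
kernels_per_filter = conv.weight.shape[1]  # one 3x3 kernel per input channel
num_kernels = num_filters * kernels_per_filter

# Each filter is a stack of in_channels 3x3 kernels.
for i, filt in enumerate(conv.weight):
    print(f"filter {i}: shape {tuple(filt.shape)}")
```

So the four 3x3 tensors in the printout are 2 filters, each holding 2 kernels.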

EDIT:
The output of this operation is a feature map of size:

(𝑊−𝐹+2𝑃)/𝑆+1

𝑊 is the input size (width or height)
𝐹 is the filter size
𝑃 is the padding
𝑆 is the stride
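As a quick sanity check of the formula, here is a small sketch that compares it with what `nn.Conv2d` actually returns. The helper `conv_output_size` and the 8x8 input are my own assumptions for illustration, and the helper uses floor division, which matches PyTorch when dilation is left at its default of 1.

```python
import torch
import torch.nn as nn

def conv_output_size(w, f, p=0, s=1):
    """Output size along one dimension: (W - F + 2P) / S + 1, rounded down."""
    return (w - f + 2 * p) // s + 1

# Hypothetical input: batch of 1, 2 channels, 8x8 pixels.
x = torch.randn(1, 2, 8, 8)

# No padding: the 3x3 filter shrinks each spatial dimension from 8 to 6.
conv = nn.Conv2d(in_channels=2, out_channels=2, kernel_size=3)
y = conv(x)
print(y.shape, conv_output_size(8, 3))

# padding=1 with a 3x3 filter keeps the spatial size at 8.
conv_same = nn.Conv2d(in_channels=2, out_channels=2, kernel_size=3, padding=1)
y_same = conv_same(x)
print(y_same.shape, conv_output_size(8, 3, p=1))
```

The channel dimension of the output always equals out_channels, independent of the formula.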

Hope this helps!

You meant [output image height x output image width x out channels], didn’t you?

What I meant is that it has the same dimensions as the input image.

Thank you sooo much, Mr PresidentDoggo.
I’m sorry, but can I ask one more thing?
I circled a part of your figure. Are the values in that part added up? Or is it a norm, or something else?

That’s only in the case of padding.

Mr PresidentDoggo
Thank you very much for your kindness.
Your help was very useful to me.

I still couldn’t understand how the two layers are merged into one layer.
Mr harsha_g, thank you for replying.

@111353 you’re welcome

To answer your question about merging the two layers: the per-channel results are summed element-wise, and then the result is offset by the bias. [source: Stanford CS convnets] (Also a great place to learn more!)
That said, I have seen some literature use an average instead.
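If you want to convince yourself of the "sum, then add bias" step, here is a rough sketch that rebuilds the Conv2d output by hand: each 3x3 kernel is applied to its own input channel with `torch.nn.functional.conv2d`, the results are summed per filter, and the filter's bias is added. The 5x5 random input and the variable names are assumptions for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
conv = nn.Conv2d(in_channels=2, out_channels=2, kernel_size=3)
x = torch.randn(1, 2, 5, 5)  # hypothetical 5x5 input with 2 channels

with torch.no_grad():
    reference = conv(x)  # what the layer computes, shape [1, 2, 3, 3]
    manual = torch.zeros_like(reference)

    for o in range(conv.out_channels):      # one filter per output channel
        for c in range(conv.in_channels):   # one 3x3 kernel per input channel
            kernel = conv.weight[o, c].view(1, 1, 3, 3)
            channel = x[:, c:c + 1]         # keep the channel dimension
            # Sum the per-channel results into the same output map.
            manual[:, o:o + 1] += F.conv2d(channel, kernel)
        manual[:, o] += conv.bias[o]        # then offset by the filter's bias

print(torch.allclose(manual, reference, atol=1e-5))
```

The comparison uses a small tolerance because the hand-written loop adds the terms in a different order than the built-in layer.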

@harsha_g is right, I made a slight mistake with the output!

The output of the conv operation is not, in general, the same size as the input image.
We can calculate the change in dimensionality from the CONV operation using this equation:

(𝑊−𝐹+2𝑃)/𝑆+1

𝑊 is the input volume