I understand that a convolution with a given kernel size computes input (image) * kernel.
Then, when I looked at PyTorch's Conv2d,
I noticed that we can choose the number of output channels we want.
If I have an RGB image, in_channels will be 3, and I could set out_channels to 15. What happens during this process to get 15 output channels from 3 input channels? Do we randomize the weights in the kernels to get 15 channels?
The number of output channels corresponds to the number of convolution filters/kernels you want this layer to have. It may help to read a tutorial on how convolutions work to see what I mean by kernel; maybe this one could be helpful.
Here is a little snippet to get a better grasp on how it works in PyTorch:
import torch
batch = torch.rand(16, 3, 100, 100)  # N, C, H, W
conv = torch.nn.Conv2d(
    in_channels=3,   # RGB channels
    out_channels=7,  # Number of kernels
    kernel_size=5,   # Size of kernels, i.e. 5x5
)
print(conv.weight.size())  # torch.Size([7, 3, 5, 5]): 7 kernels, each of depth 3 and size 5x5
print(conv(batch).size())  # torch.Size([16, 7, 96, 96])
The feature maps of each kernel are stacked along the second dimension to form a
(16, 7, 96, 96) Tensor (96 = 100 - 5 + 1, since no padding is used; padding=2 would keep 100 x 100). This is how you “increase” the number of channels.
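To make the "one kernel per output channel" idea concrete, here is a minimal sketch that recomputes a single output channel by hand. It assumes the same layer as in the snippet above; the index k and the names kernel_k, bias_k and channel_k are only illustrative. On the question about random weights: the kernel weights start out randomly initialized when the layer is created and are then updated during training.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
batch = torch.rand(16, 3, 100, 100)  # N, C, H, W
conv = torch.nn.Conv2d(in_channels=3, out_channels=7, kernel_size=5)

with torch.no_grad():
    out = conv(batch)  # (16, 7, 96, 96): no padding, so 100 - 5 + 1 = 96

    # Recompute output channel k using only kernel k.
    # Each kernel spans all 3 input channels, so it yields one feature map.
    k = 2  # illustrative choice of kernel index
    kernel_k = conv.weight[k:k + 1]  # (1, 3, 5, 5)
    bias_k = conv.bias[k:k + 1]      # (1,)
    channel_k = F.conv2d(batch, kernel_k, bias_k)  # (16, 1, 96, 96)

    # The hand-computed map should match channel k of the full output
    # up to floating-point tolerance.
    matches = torch.allclose(out[:, k:k + 1], channel_k, atol=1e-5)

print(out.shape, channel_k.shape, matches)
```

With out_channels=15 the same thing happens with 15 kernels instead of 7, each producing one of the 15 output channels.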
yes, your example is good.
thank you for your response.
This example code helps me a lot. Thanks!
@LeviViana what do N, C, H, W mean in the second line of your snippet?
N -> the batch size
C -> the number of channels
H -> Height
W -> Width
So, to perform inference on a batch of 10 RGB images of size 100 x 200, you’ll have N=10, C=3, H=100, W=200.
If you then apply a 2D convolution with 128 kernels of size 3x3, padding=1 and the default stride of 1, you’ll get a feature map of size N=10, C=128, H=100, W=200.
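The shapes in this example can be checked with a short sketch. The helper conv_out_size is hypothetical (not part of PyTorch) and simply applies the usual output-size formula floor((size + 2*padding - kernel_size) / stride) + 1, assuming dilation 1.

```python
import torch

def conv_out_size(size, kernel_size, padding=0, stride=1):
    # Hypothetical helper: output length along one spatial dimension,
    # assuming dilation=1.
    return (size + 2 * padding - kernel_size) // stride + 1

images = torch.rand(10, 3, 100, 200)  # N=10, C=3, H=100, W=200
conv = torch.nn.Conv2d(in_channels=3, out_channels=128, kernel_size=3, padding=1)

with torch.no_grad():
    features = conv(images)

expected = (
    10,                                   # N is unchanged
    128,                                  # C equals the number of kernels
    conv_out_size(100, 3, padding=1),     # H
    conv_out_size(200, 3, padding=1),     # W
)
print(tuple(features.shape), expected)
```

Only C changes here: padding=1 with a 3x3 kernel keeps H and W the same, while the 128 kernels determine the new channel count.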
@LeviViana Thanks for the help,