Number of channels in the input image

What is meant by "Number of channels in the input image" and "Number of channels in the output image" in the parameters of torch.nn.Conv2d? See here: http://pytorch.org/docs/master/nn.html

I’m sorry, I’m new, but I can’t find an answer… Some people say it’s colour channels, but then why is it 1, 6 and 6, 16 in the beginner tutorial (http://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html)? Or are these channels the number of different filters in one bundle of convolution layers?

In general, the input to a “2d” convolution in CNNs is a tensor of size “Batch x Channels x Height x Width.” For the actual input to the network, channels is usually 3 for RGB or 1 if it’s greyscale. For the outputs of layers in the network, “output channels” is analogous to the number of neurons, or the number of hidden units, of a layer. So it is the latter: output channels are the number of filters in one layer, while input channels are the number of filters in the incoming layer. In the tutorial, 1, 6 means one greyscale input channel and six filters, and 6, 16 means the next layer takes those six feature maps and applies 16 filters.
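To make the shapes concrete, here is a minimal sketch using the channel counts from the question (1, 6 and 6, 16). The batch size, the 32x32 image size and the kernel size of 5 are assumptions chosen for illustration, and the pooling layers of the tutorial network are left out.

```python
import torch
import torch.nn as nn

# Assumed input: a batch of 4 greyscale images, Batch x Channels x Height x Width.
x = torch.randn(4, 1, 32, 32)

# First layer: 1 input channel (greyscale), 6 output channels (6 filters).
conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
# Second layer: its in_channels must match the previous layer's out_channels.
conv2 = nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5)

h = conv1(x)
y = conv2(h)

print(h.shape)             # torch.Size([4, 6, 28, 28])
print(y.shape)             # torch.Size([4, 16, 24, 24])

# Each filter spans all input channels: weight is out x in x kH x kW.
print(conv1.weight.shape)  # torch.Size([6, 1, 5, 5])
print(conv2.weight.shape)  # torch.Size([16, 6, 5, 5])
```

Only the first layer's in_channels depends on the image itself; every later in_channels is fixed by the out_channels of the layer before it.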

Where would you see this, and be able to change it, in a typical PyTorch CNN?

Usually you specify the number of channels when defining your architecture (see, e.g., this example: https://github.com/AghdamAmir/3D-UNet/blob/main/unet3d.py#L132). If you are working with a pre-defined architecture, there is not much you can do in terms of changing the number of input channels. If you have a greyscale image and you wish to use it in a network intended for 3-channel RGB input, you can always duplicate the channel dimension and run it like that.
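As a rough sketch of the duplication trick, the snippet below repeats a single greyscale channel three times so it fits a layer that expects 3 input channels. The tensor sizes are made up, and the Conv2d here is only a hypothetical stand-in for the first layer of a pre-defined RGB network.

```python
import torch
import torch.nn as nn

# Assumed input: a batch of 4 greyscale images with a single channel.
grey = torch.randn(4, 1, 64, 64)

# Repeat the channel dimension 3 times; batch, height and width are unchanged.
# grey.expand(-1, 3, -1, -1) gives the same shape as a view, without copying.
rgb_like = grey.repeat(1, 3, 1, 1)

# Hypothetical stand-in for the first layer of a network built for RGB input.
first_layer = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3)
out = first_layer(rgb_like)

print(rgb_like.shape)  # torch.Size([4, 3, 64, 64])
print(out.shape)       # torch.Size([4, 8, 62, 62])
```

All three channels hold identical values, so the network sees a grey image encoded as RGB.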

See also, for instance, this link on defining a neural network layer with a specific number of input channels: Conv3d — PyTorch 2.0 documentation