I am a beginner in PyTorch. While studying the PyTorch example for beginners, I noticed that the number of output channels of one layer was equal to the number of input channels of the next layer. Why is that? Also, the number of input features for fc1 is 800. How is that number derived? Thanks a lot!
Copying the Conv2d definition from the PyTorch documentation (abridged):
nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding)
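So, the 20 in line 14 is the number of input channels to that layer. It has to equal the number of output channels of the previous layer, because the output of one layer is the input of the next. In PyTorch, you have to state the input size of each layer explicitly.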
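As a minimal sketch of why the channel counts have to line up: the layer sizes below (1 input channel for grayscale MNIST, 20 and 50 output channels, 5x5 kernels) are assumed from the numbers in this answer, and bad_conv2 is a hypothetical mismatched layer added only to show the failure.

```python
import torch
import torch.nn as nn

# Assumed sizes: 1 input channel (grayscale), 20 then 50 output channels, 5x5 kernels.
conv1 = nn.Conv2d(in_channels=1, out_channels=20, kernel_size=5)
conv2 = nn.Conv2d(in_channels=20, out_channels=50, kernel_size=5)

x = torch.zeros(1, 1, 28, 28)   # (batch, channels, height, width)
out1 = conv1(x)
print(out1.shape[1])            # channel dimension of conv1's output

out2 = conv2(out1)              # works: conv2 expects exactly that many channels

# Hypothetical layer whose in_channels does not match conv1's out_channels.
bad_conv2 = nn.Conv2d(in_channels=10, out_channels=50, kernel_size=5)
try:
    bad_conv2(out1)
    mismatch_raised = False
except RuntimeError:
    mismatch_raised = True
print(mismatch_raised)
```

Feeding conv1's output into a layer declared with a different in_channels fails at call time, which is why the two numbers must agree.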
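As for the 800: the MNIST images are of size 28x28, and the order of operations is defined in the forward function. A 5x5 convolution with stride 1 and no padding turns a side of length n into n - 5 + 1, and each 2x2 max pooling halves it. The sizes after successive layers are: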
self.conv1(x): ((28 - 5 + 1), (28 - 5 + 1)) = (24, 24)
F.max_pool2d(x): ((24/2), (24/2)) = (12, 12)
self.conv2(x): ((12 - 5 + 1), (12 - 5 + 1)) = (8, 8)
F.max_pool2d(x): ((8/2), (8/2)) = (4, 4)
The last convolutional layer has 50 output channels, so the number of values passed to fc1 is 50 * 4 * 4 = 800.
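The same arithmetic can be written as a short plain-Python sketch. The helper names conv_out and pool_out are made up for illustration, and the defaults (stride 1, no padding, 2x2 pooling) are the assumptions behind the numbers above.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a convolution (floor division, as in Conv2d)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel, stride=None):
    """Output side length of max pooling; stride defaults to the kernel size."""
    stride = kernel if stride is None else stride
    return (size - kernel) // stride + 1


size = 28                    # MNIST images are 28x28
size = conv_out(size, 5)     # after conv1
size = pool_out(size, 2)     # after the first max pooling
size = conv_out(size, 5)     # after conv2
size = pool_out(size, 2)     # after the second max pooling

out_channels = 50            # channels produced by the last convolution
fc1_in_features = out_channels * size * size
print(size, fc1_in_features)
```

If the kernel size, stride, padding, or input image size changes, rerunning this gives the new in_features value for fc1.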
If you are not familiar with convolution arithmetic, I suggest reading this:
A Guide to Convolutional Arithmetic