Hi, I am new to CNNs and PyTorch. I am wondering why, in the first layer, the number of input channels is 3 and the number of output channels is 6 (self.conv1 = nn.Conv2d(3, 6, 5)). How is the volume size of the output calculated?
I know there is a formula, W2 = (W1 − F + 2P)/S + 1, to calculate the output size, and I read http://cs231n.github.io/convolutional-networks/#architectures which explains it very well, but I still cannot understand why in self.conv1 = nn.Conv2d(3, 6, 5) the input is 3 channels and the output is 6 channels.
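
To make the formula concrete, here is a minimal sketch in plain Python. It assumes padding 0 and stride 1 (the nn.Conv2d defaults) and a 32x32 input with 3 color channels; the helper name conv_output_size is made up for illustration. As far as I understand, the formula only gives the spatial width and height. The 3 has to match the number of channels in the input image, and the 6 is the number of filters the layer learns, which is a design choice rather than something the formula produces.

```python
def conv_output_size(w1, f, p=0, s=1):
    """Spatial output size W2 = (W1 - F + 2P) / S + 1, rounded down."""
    return (w1 - f + 2 * p) // s + 1


# First layer: nn.Conv2d(3, 6, 5) on a 32x32 image (assumed input size).
in_channels, out_channels, kernel = 3, 6, 5
w2 = conv_output_size(32, kernel)

# The spatial size comes from the formula, the depth from the filter count.
output_volume = (out_channels, w2, w2)
print(output_volume)
```

So under these assumptions the first layer would turn a 3x32x32 volume into a 6x28x28 volume.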
For example, in the following script, I can see that the output channel count of each layer is the input channel count of the next layer, but I couldn't figure out how it is calculated in the first line, given that the size of the image is 32x32x3.
Any help would be appreciated.
    self.conv1 = nn.Conv2d(3, 6, 5)
    self.pool = nn.MaxPool2d(2, 2)
    # creates a module which initializes weights etc
    self.conv2 = nn.Conv2d(6, 16, 5)
    # an affine operation: y = Wx + b
    self.fc1 = nn.Linear(16 * 5 * 5, 120)
    self.fc2 = nn.Linear(120, 84)
    self.fc3 = nn.Linear(84, 10)
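
One way to check the sizes is to push a dummy tensor through the layers and print the shape after each step. This is only a sketch: it assumes a single 3x32x32 input and that the pooling layer is applied after each convolution, which the snippet above does not actually show (the forward pass is not included).

```python
import torch
import torch.nn as nn

# Same layer definitions as in the script above.
conv1 = nn.Conv2d(3, 6, 5)
pool = nn.MaxPool2d(2, 2)
conv2 = nn.Conv2d(6, 16, 5)

# Assumed input: a batch of one 3-channel 32x32 image.
x = torch.randn(1, 3, 32, 32)

# Assumed order: conv1 -> pool -> conv2 -> pool.
shapes = []
for layer in (conv1, pool, conv2, pool):
    x = layer(x)
    shapes.append(tuple(x.shape))
    print(type(layer).__name__, tuple(x.shape))

# conv1 holds 6 filters, each spanning all 3 input channels with a 5x5 window.
print(tuple(conv1.weight.shape))
```

If that order is right, the last shape has 16 channels of size 5x5, which would explain the 16 * 5 * 5 input size of fc1.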