Hello. New here and new to ML!
I have a question concerning the PyTorch CIFAR-10 example. Please bear with me and any mistakes I'll be making. In the CIFAR-10 tutorial, the class Net is defined as follows:
self.conv1 = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=5)
self.pool = nn.MaxPool2d(2, 2)
self.conv2 = nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5)
self.fc1 = nn.Linear(16 * 5 * 5, 120)
self.fc2 = nn.Linear(120, 84)
self.fc3 = nn.Linear(84, 10)
def forward(self, x):
    x = self.pool(F.relu(self.conv1(x)))
    x = self.pool(F.relu(self.conv2(x)))
    x = x.view(-1, 16 * 5 * 5)
    x = F.relu(self.fc1(x))
    x = F.relu(self.fc2(x))
    x = self.fc3(x)
    return x
My question specifically concerns: self.fc1 = nn.Linear(16 * 5 * 5, 120).
I understand we have 16 activation maps, but how can each activation map be 5 by 5?
Considering the original image, which is 32 by 32: applying the first convolution reduced the dimensions to 28 by 28 (output size = input size − kernel size + 1, with no padding), and max pooling further reduced them to 14 by 14.
Hence, when I apply the second convolution, should my dimensions not be reduced to 10 by 10 (14 − 5 + 1)?
Therefore, should the input to self.fc1 not be 16 * 10 * 10?
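To make my reasoning concrete, here is a small standalone sketch (my own, not from the tutorial) that feeds a dummy 32 by 32 input through the same layer definitions and prints the tensor shape after each step:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Dummy batch: one 3-channel 32x32 image with random placeholder values
x = torch.randn(1, 3, 32, 32)

# Same layer definitions as in the tutorial's Net class
conv1 = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=5)
pool = nn.MaxPool2d(2, 2)
conv2 = nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5)

x = F.relu(conv1(x))
print("after conv1:", x.shape)
x = pool(x)
print("after first pool:", x.shape)
x = F.relu(conv2(x))
print("after conv2:", x.shape)
x = pool(x)
print("after second pool:", x.shape)
```

Running this shows exactly which spatial size reaches the flattening step before self.fc1.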