PyTorch CIFAR-10 Example - FC Layer Inputs

Hello. New here and new to ML!

I have a query concerning the PyTorch CIFAR-10 example. Please bear with me and the potential mistakes I'll be making. In the CIFAR-10 tutorial, the class Net is defined as follows:

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # two convolution layers and one shared 2x2 max-pooling layer
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5)
        # fully connected layers; fc1 expects the flattened 16 x 5 x 5 maps
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

My question specifically concerns: self.fc1 = nn.Linear(16 * 5 * 5, 120).
I understand we have 16 activation maps, but how can each activation map be 5 by 5?
Considering the original image, which is 32 by 32: applying the first convolution reduces it to 28 by 28, and max pooling further reduces it to 14 by 14. Hence, when I apply the second convolution, should the dimensions not be reduced to 10 by 10?

Therefore, should the input to self.fc1 not be 16 * 10 * 10?
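
For reference, here is the size arithmetic I am using (a minimal sketch; conv_out is just a helper I wrote for illustration, assuming stride 1 and no padding, which I believe are the nn.Conv2d defaults):

    def conv_out(w, k, stride=1, pad=0):
        # standard convolution output size
        return (w - k + 2 * pad) // stride + 1

    w = conv_out(32, 5)  # conv1: 32 -> 28
    w = w // 2           # 2x2 max pool: 28 -> 14
    print(w)             # 14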

Thank you.
Best regards!

Hi, in the forward() function you can see that self.pool() is applied twice:

def forward(self, x):
    x = self.pool(F.relu(self.conv1(x)))
    x = self.pool(F.relu(self.conv2(x)))      # <- Here
    x = x.view(-1, 16 * 5 * 5)
    x = F.relu(self.fc1(x))
    x = F.relu(self.fc2(x))
    x = self.fc3(x)
    return x

After the second convolution the 14 by 14 maps shrink to 10 by 10, and the second pooling halves that to 5 by 5. So the output size you're dealing with is indeed 16 * 5 * 5.
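
If it helps, a quick way to check is to feed a dummy input through the layers and print the intermediate shapes (a minimal sketch, assuming the Net class from above and the standard imports):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    net = Net()
    x = torch.randn(1, 3, 32, 32)       # one fake 32x32 RGB image
    x = net.pool(F.relu(net.conv1(x)))  # conv1: 32 -> 28, pool: 28 -> 14
    print(x.shape)                      # torch.Size([1, 6, 14, 14])
    x = net.pool(F.relu(net.conv2(x)))  # conv2: 14 -> 10, pool: 10 -> 5
    print(x.shape)                      # torch.Size([1, 16, 5, 5])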

Oh! I completely missed that.

Thank you very much! Appreciate the help!
