I have an AlexNet-like neural network defined as follows:
self.conv = nn.Sequential()
self.conv.add_module('conv1_s1', nn.Conv2d(3, 64, kernel_size=9, stride=2, padding=0))
..
self.conv.add_module('conv2_s1', nn.Conv2d(64, 96, kernel_size=5, padding=2, groups=2))
..
self.conv.add_module('conv3_s1', nn.Conv2d(96, 128, kernel_size=3, padding=1))
..
self.conv.add_module('conv4_s1', nn.Conv2d(128, 128, kernel_size=3, padding=1, groups=2))
..
self.conv.add_module('conv5_s1', nn.Conv2d(128, 96, kernel_size=3, padding=1, groups=2))
self.fc6 = nn.Sequential()
self.fc6.add_module('fc6_s1', nn.Linear(96*2*2, 256))
..
..
The output of conv5_s1 has 96 channels, each of size 2x2 (see the input size of fc6_s1, which takes the flattened output). Does it make sense to have so many channels for such a small 2x2 spatial output? My concern is that the same feature map would end up being produced many times. Would 24 channels in conv5_s1 be sufficient, since a 2x2 map has 4 positions and 4! = 24 possibilities? Or is my understanding wrong?
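To make the numbers concrete, here is a minimal sketch in plain Python (no PyTorch needed) of the sizes involved for the 96-channel and the 24-channel variant. The helper names `conv_params` and `linear_params` are only for illustration, and the formulas assume the usual parameter counts for a grouped convolution and a linear layer, both with bias. The last line just reproduces the 4! count from my question; it is not meant as a claim that this count is the right way to reason about it.

```python
import math


def conv_params(in_ch, out_ch, kernel_size, groups=1):
    """Weights plus biases of a 2D convolution with a square kernel.

    With groups > 1, each output channel only sees in_ch // groups inputs.
    """
    weights = out_ch * (in_ch // groups) * kernel_size * kernel_size
    return weights + out_ch


def linear_params(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


spatial = 2 * 2  # each conv5_s1 feature map is 2x2

for channels in (96, 24):
    flat = channels * spatial  # input size of fc6_s1 after flattening
    conv5 = conv_params(128, channels, kernel_size=3, groups=2)
    fc6 = linear_params(flat, 256)
    print(f"{channels} channels: flattened={flat}, conv5_s1 params={conv5}, fc6_s1 params={fc6}")

# The count from the question: 4 positions in a 2x2 map, 4! orderings
print("4! =", math.factorial(spatial))
```

So the 96-channel version feeds 384 values into fc6_s1, while a 24-channel version would feed 96.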
Thanks for your help!