First post here. I'm having trouble finding the right resources to understand how to calculate the dimensions required to transition from a conv block to a linear block. I have seen several equations, which I attempted to implement unsuccessfully:
- “The formula for the output size of a conv layer:
  Output = ((I - K + 2P) / S) + 1, where
  I - input size (height or width),
  K - kernel size,
  P - padding,
  S - stride.”
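To sanity-check that formula, here is a small helper I put together (my own sketch, not from any library). A 3x3 conv with padding 1 and stride 1 preserves the spatial size, while a 2x2 max pool with stride 2 halves it:

```python
def conv_out(i, k, p=0, s=1):
    """Output spatial size of a conv/pool layer: floor((I - K + 2P) / S) + 1."""
    return (i - k + 2 * p) // s + 1

# 3x3 conv, padding=1, stride=1: size is preserved
print(conv_out(32, k=3, p=1, s=1))  # -> 32
# 2x2 max pool, stride=2: size is halved
print(conv_out(32, k=2, p=0, s=2))  # -> 16
```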
The example network I have been trying to understand is a CNN for the CIFAR-10 dataset. Below is the third conv layer block, which feeds into a linear layer with 4096 input features:
```python
    # Conv Layer block 3
    nn.Conv2d(in_channels=128, out_channels=256, kernel_size=3, padding=1),
    nn.BatchNorm2d(256),
    nn.ReLU(inplace=True),
    nn.Conv2d(in_channels=256, out_channels=256, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=2, stride=2),
)

self.fc_layer = nn.Sequential(
    nn.Dropout(p=0.1),
    nn.Linear(4096, 1024),
    nn.ReLU(inplace=True),
    nn.Linear(1024, 512),
    nn.ReLU(inplace=True),
    nn.Dropout(p=0.1),
    nn.Linear(512, 10)
)
```
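Working the formula forward (and assuming, since I only show block 3, that the first two blocks also each end in a `MaxPool2d(kernel_size=2, stride=2)`), the 32x32 CIFAR-10 input gets halved three times while the last conv stack ends at 256 channels:

```python
# CIFAR-10 images are 3x32x32. Each conv uses k=3, p=1, s=1 (size-preserving),
# so only the three 2x2/stride-2 max pools change the spatial size: 32 -> 16 -> 8 -> 4.
size = 32
for _ in range(3):                    # one pool per conv block (assumed for blocks 1-2)
    size = (size - 2 + 2 * 0) // 2 + 1  # the Output = (I - K + 2P)/S + 1 formula
channels = 256                        # out_channels of the last conv layer
flat = channels * size * size         # features after flattening
print(size, flat)                     # -> 4 4096
```

So the 4096 is just 256 channels x 4 x 4 spatial positions after flattening.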
I need to figure out the equations/resources/protocol to calculate this transition between Conv and linear. How did we arrive at 4096?
EDIT: I have also used ptrblck’s print-layer trick for help (roughly the snippet below, which prints the activation shape inside `forward`), but I still struggle to understand this transition intuitively.

```python
def forward(self, x):
    print(x.shape)  # e.g. torch.Size([batch, 256, 4, 4]) just before flattening
    return x
```
Any and all help greatly appreciated, Dan