Hi everyone,
First post here. I'm having trouble finding the right resources to understand how to calculate the dimensions required to transition from a conv block to a linear block. I have seen several equations, which I attempted to implement unsuccessfully:
- "The formula for the output size:
Output = ((I - K + 2P) / S) + 1, where
I - input size,
K - kernel size,
P - padding,
S - stride."
and
- "W' = ((W - F + 2P) / S) + 1"
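In code, my understanding of that formula is the following (a direct transcription; the function and variable names are mine):

```python
def conv_output_size(i, k, p=0, s=1):
    """Spatial output size of a conv/pool layer: O = (I - K + 2P) / S + 1."""
    return (i - k + 2 * p) // s + 1

# A 3x3 conv with padding=1, stride=1 keeps a 32x32 input at 32x32:
print(conv_output_size(32, 3, p=1, s=1))  # 32
# A 2x2 max-pool with stride=2 halves it:
print(conv_output_size(32, 2, p=0, s=2))  # 16
```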
The example network I have been trying to understand is a CNN for the CIFAR-10 dataset. Below is the third conv layer block, which feeds into a linear layer with 4096 input features:
    # Conv Layer block 3
    nn.Conv2d(in_channels=128, out_channels=256, kernel_size=3, padding=1),
    nn.BatchNorm2d(256),
    nn.ReLU(inplace=True),
    nn.Conv2d(in_channels=256, out_channels=256, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=2, stride=2),
)

self.fc_layer = nn.Sequential(
    nn.Dropout(p=0.1),
    nn.Linear(4096, 1024),
    nn.ReLU(inplace=True),
    nn.Linear(1024, 512),
    nn.ReLU(inplace=True),
    nn.Dropout(p=0.1),
    nn.Linear(512, 10),
)
I need to figure out the equations/resources/procedure for calculating this transition from conv to linear. How did we arrive at 4096?
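As a sanity check, I ran a dummy tensor through a standalone copy of block 3. The 128 x 8 x 8 input is my assumption about what the two earlier blocks produce from a 32x32 CIFAR-10 image (each ending in a stride-2 max-pool: 32 -> 16 -> 8):

```python
import torch
import torch.nn as nn

# Standalone copy of conv block 3; input shape 1 x 128 x 8 x 8 is assumed.
block3 = nn.Sequential(
    nn.Conv2d(in_channels=128, out_channels=256, kernel_size=3, padding=1),
    nn.BatchNorm2d(256),
    nn.ReLU(inplace=True),
    nn.Conv2d(in_channels=256, out_channels=256, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=2, stride=2),
)

x = torch.randn(1, 128, 8, 8)
out = block3(x)
print(out.shape)              # torch.Size([1, 256, 4, 4])
print(out.flatten(1).shape)   # torch.Size([1, 4096]) -> the nn.Linear(4096, ...) input
```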
EDIT: I have also used ptrblck's Print layer (below) for help, but I still struggle to understand this transition intuitively.
class Print(nn.Module):
    def forward(self, x):
        print(x.size())
        return x
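And here is how I have been wiring it in (a standalone toy example, with the class repeated so the snippet runs; the channel counts and 16x16 input are made up):

```python
import torch
import torch.nn as nn

# ptrblck's Print module, repeated here so the snippet is self-contained
class Print(nn.Module):
    def forward(self, x):
        print(x.size())
        return x

# Sandwich Print around the pool to watch the spatial size change:
net = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    Print(),                       # prints torch.Size([1, 8, 16, 16])
    nn.MaxPool2d(2, 2),
    Print(),                       # prints torch.Size([1, 8, 8, 8])
)
net(torch.randn(1, 3, 16, 16))
```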
Any and all help greatly appreciated, Dan