RuntimeError: size mismatch at

My neural network architecture is as follows:

CONV1 : 3 inputs, 64 outputs
POOL1 : 2x2 maxpool 
CONV2 : 64 inputs, 128 outputs
POOL2: 2x2 maxpool 
CONV3: 128 inputs, 256 outputs
CONV4: 256 inputs, 256 outputs
POOL3: 2x2 maxpool 
FC1: 256 inputs, 1024 outputs
FC2: 1024 inputs, 1024 outputs
Batch norm: input = FC2
SOFTMAX: 1024 inputs, 10 outputs

My network code:


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1)
        self.conv2 = nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1)
        self.conv3 = nn.Conv2d(128, 256, kernel_size=3, stride=1, padding=1)
        self.conv4 = nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1)
        self.fc1   = nn.Linear(256,1024)
        self.fc2   = nn.Linear(1024,1024)
        self.fc2_bn= nn.BatchNorm2d(1024)
        self.classifier = nn.Linear(1024, 10)

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), kernel_size=2, stride=2))
        x = F.relu(F.max_pool2d(self.conv2(x),kernel_size=2, stride=2))
        x = F.relu(self.conv3(x))
        x = F.relu(F.max_pool2d(self.conv4(x), kernel_size=2, stride=2))
        x = x.view(x.size(0), -1)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2_bn(self.fc2(x)))
        x = F.relu(self.classifier(x))
        return x

I think the problem is with x = x.view(x.size(0), -1). Can someone help me solve this problem? What is the correct code?

Let’s work through the sizes:

The output of the POOL3 layer has size (1, 256, 1, 1).
The output of the view is (1, 256)
The output of the FC1 layer is (1, 1024)
The output of the FC2 layer is (1, 1024)
It doesn’t make sense to apply BatchNorm2d to something of size (1, 1024), since BatchNorm2d expects a 4D input of shape (N, C, H, W). Perhaps you meant BatchNorm1d?

My image has 3 channels, so it makes sense to use BatchNorm2d, right? The image dimensions are 32x32.

Not between linear layers, since their outputs are 1-dimensional per sample (shape (N, features)). BatchNorm1d is the one to use there.
Also, you should probably change the last non-linearity from F.relu() to F.log_softmax() or F.softmax().

I think self.fc1 should be nn.Linear(256*4*4, 1024) instead of what is used in the code. Am I correct?

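If your image dimensions are 32x32 at the beginning, the three 2x2 max-pool layers in your setup will reduce it to 4x4 (32 -> 16 -> 8 -> 4). The convolutions keep the spatial size because they use kernel_size=3 with padding=1.
So yes, your fc1 layer should probably take 4x4x256 = 4096 inputs.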
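
If you would rather calculate the size than guess it, the arithmetic can be sketched in plain Python. This is only an illustration: the helper names conv2d_out, pool2d_out and flattened_features are made up for this post and are not part of PyTorch. It assumes square inputs and the layer settings from the network above (3x3 convs with stride 1 and padding 1, 2x2 max-pools with stride 2).

```python
def conv2d_out(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel_size) // stride + 1


def pool2d_out(size, kernel_size=2, stride=2):
    """Spatial output size of a max-pool along one dimension (no padding)."""
    return (size - kernel_size) // stride + 1


def flattened_features(input_size, channels=256):
    """Follow the network above and return (side, number of fc1 inputs)."""
    s = input_size
    s = pool2d_out(conv2d_out(s, 3, 1, 1))  # CONV1 + POOL1
    s = pool2d_out(conv2d_out(s, 3, 1, 1))  # CONV2 + POOL2
    s = conv2d_out(s, 3, 1, 1)              # CONV3 (no pooling)
    s = pool2d_out(conv2d_out(s, 3, 1, 1))  # CONV4 + POOL3
    return s, channels * s * s


side, features = flattened_features(32)
print(side, features)  # 4 4096
```

With an 8x8 input the same function gives a 1x1 feature map and 256 features, which is the only case where nn.Linear(256, 1024) would match.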

The easiest way to get the right size is to add a print statement in your forward pass and just look at the size.
At least that’s how I do it if I don’t want to calculate it.

x = F.relu(...
print(x.size())
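
For reference, here is a rough sketch of how the network might look with the suggestions from this thread applied together: fc1 sized for a 4x4 feature map, BatchNorm1d after fc2, and log_softmax instead of the final relu. It assumes 32x32 RGB inputs and a batch size greater than 1 (BatchNorm1d needs more than one sample in training mode). The dummy batch at the end is only there to trigger the print; treat this as one possible fix, not the only correct one.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1)
        self.conv2 = nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1)
        self.conv3 = nn.Conv2d(128, 256, kernel_size=3, stride=1, padding=1)
        self.conv4 = nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1)
        # 256 channels * 4 * 4 spatial positions (assumes 32x32 input images)
        self.fc1 = nn.Linear(256 * 4 * 4, 1024)
        self.fc2 = nn.Linear(1024, 1024)
        # BatchNorm1d, because the output of fc2 has shape (N, 1024)
        self.fc2_bn = nn.BatchNorm1d(1024)
        self.classifier = nn.Linear(1024, 10)

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), kernel_size=2, stride=2))
        x = F.relu(F.max_pool2d(self.conv2(x), kernel_size=2, stride=2))
        x = F.relu(self.conv3(x))
        x = F.relu(F.max_pool2d(self.conv4(x), kernel_size=2, stride=2))
        print(x.size())  # debugging aid: shape right before flattening
        x = x.view(x.size(0), -1)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2_bn(self.fc2(x)))
        # log-probabilities over the 10 classes instead of a final relu
        return F.log_softmax(self.classifier(x), dim=1)


net = Net()
out = net(torch.randn(2, 3, 32, 32))  # dummy batch of two 32x32 RGB images
print(out.size())
```

The first print should show torch.Size([2, 256, 4, 4]) and the second torch.Size([2, 10]). Since the model returns log-probabilities, it would pair with F.nll_loss rather than F.cross_entropy, which applies log_softmax itself.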