Maxpool2d output size inconsistent with formula

According to the output-size formula in the PyTorch docs, the output from max_pool2d should be 24 in my case, but I am not getting that result. Am I missing something here?

class mnist_conv2d(nn.Module):
  """Small two-conv-layer CNN for MNIST classification.

  Expects input of shape (N, 1, 28, 28) (grayscale images) and returns
  (N, classes) log-probabilities via log_softmax. The forward pass prints
  the tensor size after every stage for debugging.
  """

  def __init__(self,classes):
    super().__init__()  # Python 3 zero-arg form of super(mnist_conv2d, self).__init__()
    self.conv1 = nn.Conv2d(in_channels=1,out_channels=16,kernel_size=3) # 1 in channel as they are greyscale
    self.conv2 = nn.Conv2d(in_channels=16,out_channels=32,kernel_size=3)
    self.drop = nn.Dropout2d()
    # After conv/pool stack the feature map is (N, 32, 2, 2) -> 32*2*2 = 128 flattened features.
    self.fc1 = nn.Linear(in_features=128,out_features=50)
    self.fc2 = nn.Linear(in_features=50,out_features=classes)

  def forward(self, x):
    print("Original: ", x.size())
    x = self.conv1(x)
    print("Conv2d: ", x.size())
    # conv1: (28 + 2*0 - 1*(3-1) - 1)/1 + 1 = 26
    x = F.max_pool2d(x,kernel_size=3)
    print("Maxpool: ", x.size())
    # NOTE: max_pool2d's stride defaults to kernel_size (3), not 1, so:
    # (26 + 2*0 - 1*(3-1) - 1)/3 + 1 = 8  (not 24)
    x = F.relu(x)
    print("Relu: ", x.size())
    x = self.conv2(x)
    print("Conv2d: ",x.size())
    # conv2: (8 + 2*0 - 1*(3-1) - 1)/1 + 1 = 6
    x = self.drop(x)
    print("Dropout: ",x.size())
    x = F.max_pool2d(x,kernel_size=3)
    print("Maxpool: ",x.size())
    # pool with default stride = kernel_size: (6 + 2*0 - 1*(3-1) - 1)/3 + 1 = 2
    x = F.relu(x)
    print("Relu: ",x.size())
    x = x.view(x.shape[0],-1)
    print("Flattened: ",x.size())  # (N, 32*2*2) = (N, 128)
    x = self.fc1(x)
    x = F.relu(x)
    x = self.fc2(x)
    x = F.log_softmax(x,dim=1)
    return x

Passing a 1,1,28,28 image or tensor to this model results in this output:

Original:  torch.Size([1, 1, 28, 28])
Conv2d:  torch.Size([1, 16, 26, 26])
Maxpool:  torch.Size([1, 16, 8, 8])
Relu:  torch.Size([1, 16, 8, 8])
Conv2d:  torch.Size([1, 32, 6, 6])
Dropout:  torch.Size([1, 32, 6, 6])
Maxpool:  torch.Size([1, 32, 2, 2])
Relu:  torch.Size([1, 32, 2, 2])
Flattened:  torch.Size([1, 128])

Shouldn't the output from the maxpool layers be of size 24x24, not 8x8?

Nevermind I missed this part from the docs:

stride – the stride of the window. Default value is kernel_size

So the equation is (26 + 2*0 - 1*(3-1) - 1)/3 + 1 = 8
and not (26 + 2*0 - 1*(3-1) - 1)/1 + 1 = 24.
The stride in the denominator is 3, not 1 (stride 1 is the default for conv2d layers).

1 Like