Output size is too small at SpatialDilatedMaxPooling.c:67 error

I am new to PyTorch. I am trying to implement basic CIFAR-10 classification as given in the CIFAR-10 tutorial on PyTorch's website. It worked fine and gave me an accuracy of 56%, like in the tutorial, but then I added another layer and it gave me this error:

Given input size: (32x1x1). Calculated output size: (32x0x0). Output size is too small at /opt/conda/conda-bld/pytorch_1503966894950/work/torch/lib/THNN/generic/SpatialDilatedMaxPooling.c:67

I know something similar has been asked before, but I couldn't solve this problem.

My network looks like this:

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv3 = nn.Conv2d(16, 32, 5)
        self.fc1 = nn.Linear(32 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = self.pool(F.relu(self.conv3(x)))
        x = x.view(-1, 32 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

conv3 is the new layer.
What am I missing?


The problem is that the output of your conv3 is 32x1x1 (32 channels, height 1, width 1), and you then try to apply a pooling layer with a 2x2 kernel to it. The input is too small: for a 2x2 kernel, the height and width each need to be at least 2.
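You can see this from the standard output-size formula for a convolution or pooling layer with no padding and no dilation: out = floor((in - kernel) / stride) + 1. A quick sketch (the helper function name is mine, just for illustration):

```python
def out_size(in_size, kernel, stride=1):
    # floor((in - kernel) / stride) + 1, assuming no padding and no dilation
    return (in_size - kernel) // stride + 1

# MaxPool2d(2, 2) applied to a 1x1 input gives a 0x0 output:
print(out_size(1, 2, 2))   # the "Calculated output size: (32x0x0)" in the error

# whereas a 5x5 conv on a 32x32 CIFAR-10 image gives 28x28:
print(out_size(32, 5))
```

A zero (or negative) spatial dimension is exactly what the "Output size is too small" check rejects.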

Thanks for the reply. But I don't understand: how would I know that the output of conv3 is 32x1x1? And how do I solve this error?

The error message states: Given input size: (32x1x1). Calculated output size: (32x0x0), inside the SpatialDilatedMaxPooling.c file.
So the problem is that the input of the max pooling is too small. Since the number of channels is 32, it has to be after conv3, because in your case only conv3 outputs 32 channels, and there is a single pooling layer after conv3.


Hey there,

Your convolutional and pooling layers are reducing the spatial size of the feature maps, so that the max pooling layer ends up receiving a 1x1 input.

It would probably help to manually track the output size of your convolutional/pooling layers.
You could try printing the output sizes and/or calculating how each layer affects your input's dimensions, to better understand this error and why it occurred.
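Tracking the sizes layer by layer can be sketched without even running the network, using the no-padding output-size formula out = floor((in - kernel) / stride) + 1 (the helper name below is mine):

```python
def out_size(in_size, kernel, stride=1):
    # floor((in - kernel) / stride) + 1, assuming no padding and no dilation
    return (in_size - kernel) // stride + 1

size = 32  # CIFAR-10 images are 32x32
for name, kernel, stride in [("conv1", 5, 1), ("pool", 2, 2),
                             ("conv2", 5, 1), ("pool", 2, 2),
                             ("conv3", 5, 1), ("pool", 2, 2)]:
    size = out_size(size, kernel, stride)
    print(f"{name}: {size}x{size}")
```

This traces 32 -> 28 -> 14 -> 10 -> 5 -> 1 -> 0: the final pool produces a 0x0 output, which is the error. Alternatively, printing x.size() after each step inside forward shows the same thing on a real input.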