Input/output size runtime error while doing Transfer Learning on CIFAR10

Nimrod_Daniel · December 9, 2018, 10:40am

Hi,

I’m applying transfer learning to a pretrained network on CIFAR-10, but I get a very odd runtime error, even though there seems to be no problem with the input and output sizes.

Basically, all I did was change the network and optimizer section of code that was already working. It’s very odd, considering that I relied on code from Udacity. PyTorch’s tutorial shows a very similar way to do it; I tried that too and got the same error.

This is the section, very simple:

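import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import models

# Load the pretrained network and freeze its weights
model = models.densenet121(pretrained=True)
for param in model.parameters():
    param.requires_grad = False

# Replace the 1000-class ImageNet classifier with a 10-class head
model.classifier = nn.Sequential(nn.Linear(1024, 256),
                                 nn.ReLU(),
                                 nn.Dropout(0.2),
                                 nn.Linear(256, 10),
                                 nn.LogSoftmax(dim=1))

if torch.cuda.is_available():
    model.cuda()

criterion = nn.NLLLoss()
# Only the new classifier's parameters are optimized
optimizer = optim.Adam(model.classifier.parameters(), lr=0.01)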

For reference, this is the original classifier layer of densenet121:
(classifier): Linear(in_features=1024, out_features=1000, bias=True).

All I did was replace (1024, 1000) with something else. Any idea why I get the following error?
(Python 3.6, PyTorch 0.4.0)

RuntimeError: Given input size: (1024x1x1). Calculated output size: (1024x-5x-5). Output size is too small at /opt/conda/conda-bld/pytorch_1525909934016/work/aten/src/THCUNN/generic/SpatialAveragePooling.cu:63

ptrblck · December 9, 2018, 11:32pm

Did you somehow manipulate the forward method of your model? Your code works fine on my machine using random input:

output = model(torch.randn(1, 3, 224, 224))
output.mean().backward()
optimizer.step()

Nimrod_Daniel · December 10, 2018, 7:25am

No, the forward method is already defined in the model; I didn’t change anything in the model. The code is simple: basically, all I have is a data loader, the model, training, and testing.

I should also mention another message I get:

File "/home/nimrod/anaconda3/lib/python3.6/site-packages/torchvision/models/densenet.py", line 222, in forward
    out = F.avg_pool2d(out, kernel_size=7, stride=1).view(features.size(0), -1)

Maybe it’s all because the images are too small. I see this message only with densenet121.

On ResNet18 I get:

File "/home/nimrod/anaconda3/lib/python3.6/site-packages/torch/nn/modules/pooling.py", line 547, in forward
    self.padding, self.ceil_mode, self.count_include_pad)

This leads me to believe that the input/output size runtime error I had with resnet18/50 and densenet121 occurs because I trained them on 32x32 images. After all, I’m using networks that were trained on ImageNet, with kernel sizes that might be bigger than some layer’s output. Udacity uses cats vs. dogs and the PyTorch tutorial uses the hymenoptera dataset; neither has images that small. I ran the hymenoptera code (all of it) from the PyTorch tutorial and it worked well. And you used a 224x224 input.
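To check this hypothesis, here is a rough sketch of the size arithmetic in plain Python. It assumes the usual torchvision densenet121 layout (a 7x7 stride-2 stem convolution with padding 3, a 3x3 stride-2 max pool with padding 1, and three transition layers that each halve the resolution with a 2x2 average pool) followed by the fixed `F.avg_pool2d(out, kernel_size=7, stride=1)` from the message above. The helper names are made up for illustration.

```python
def out_size(size, kernel, stride, padding=0):
    """Output length of a conv/pool layer along one spatial dimension (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


def densenet121_final_pool_size(input_size):
    """Trace one spatial dimension through the downsampling layers.

    Layer parameters are assumed from torchvision's densenet121. Dense blocks
    are skipped because they preserve the spatial size.
    """
    size = out_size(input_size, kernel=7, stride=2, padding=3)  # stem convolution
    size = out_size(size, kernel=3, stride=2, padding=1)        # stem max pool
    for _ in range(3):                                          # three transition layers
        size = out_size(size, kernel=2, stride=2)
    feature_map = size
    pooled = out_size(feature_map, kernel=7, stride=1)          # fixed 7x7 average pool
    return feature_map, pooled


print(densenet121_final_pool_size(224))  # (7, 1)
print(densenet121_final_pool_size(32))   # (1, -5)
```

Under these assumptions a 32x32 input reaches the final pooling layer as a 1x1 feature map, and a 7x7 kernel on it gives the -5 reported in the error (1024x1x1 in, 1024x-5x-5 out).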

I think that if I want to do transfer learning on small images then I need to reduce the kernel size (and check padding) of such networks, and it makes sense that it requires some tweaking. It requires a close look at the network I work with. That’s probably the reason why I had that problem.
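An alternative that leaves the network untouched is to upsample the inputs to the resolution the network was trained on. The following is only a minimal sketch with random tensors standing in for CIFAR-10 batches. It assumes a PyTorch version where `F.interpolate` is available (it may not exist on 0.4.0). In a real pipeline, adding `transforms.Resize(224)` to the torchvision transform would typically do the same job. The second part shows adaptive pooling as a possible replacement for a fixed 7x7 pool; it only removes the size error and says nothing about accuracy on such small feature maps.

```python
import torch
import torch.nn.functional as F

# Stand-in for one CIFAR-10 batch: 4 RGB images of 32x32 (random data, not real images).
batch = torch.randn(4, 3, 32, 32)

# Option 1: upsample to 224x224 before the forward pass.
resized = F.interpolate(batch, size=(224, 224), mode="bilinear", align_corners=False)
print(resized.shape)  # torch.Size([4, 3, 224, 224])

# Option 2: adaptive pooling accepts any spatial size, including the 1x1
# feature map that a 32x32 input produces at the end of densenet121.
features = torch.randn(4, 1024, 1, 1)  # hypothetical final feature map
pooled = F.adaptive_avg_pool2d(features, (1, 1)).view(features.size(0), -1)
print(pooled.shape)  # torch.Size([4, 1024])
```

Upsampling costs more compute per image but keeps the pretrained filters operating at roughly the scale they were trained on.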

It looks like I wanted to learn how to do TL the quickest way (load CIFAR-10), but I learned a good lesson instead.