TorchVision Resnet18 model weird output CIFAR10

I am trying to use the pretrained ResNet18 model from TorchVision, but I am not sure what the right way is to transform the input images so that they are compatible with ResNet18.

First I tried with this transformation :

    transform = torchvision.transforms.Compose([
        torchvision.transforms.ToTensor(),
        normalize,
    ])

which gives me the following error:

    RuntimeError: Given input size: (512x1x1). Calculated output size: (512x-5x-5). Output size is too small at /pytorch/aten/src/THNN/generic/SpatialAveragePooling.c:64

After that, I looked at a few examples on GitHub and changed it to this:

    transform = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ])

While this one runs, the predicted class indices are greater than 10, which should not be possible for CIFAR10:

    tensor([338, 613, 536, 772, 59, 260, 854, 737, 806, 414])

Can anyone point me to the right transform to use for this case?

You probably need to look at the datasets these models were trained on:

  • The pretrained ResNet-xx models are trained on ImageNet, so they expect input images of size 224x224 and predict scores for 1000 classes.
  • CIFAR10 data has 32x32 images and 10 classes.

The first error in your post is due to this size mismatch: a 32x32 input shrinks to a 512x1x1 feature map by the time it reaches ResNet18's 7x7 average pooling, which is why the calculated output size in the error message is negative.
In the second case the model runs, but it is predicting class scores (logits, to be specific) for the 1000 ImageNet classes, not for the 10 CIFAR10 classes.
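You can verify this by checking the shape of the model output. A minimal sketch, assuming the standard torchvision pretrained ImageNet weights:

    import torch
    import torchvision

    model = torchvision.models.resnet18(pretrained=True)
    model.eval()

    # A dummy 224x224 RGB batch, i.e. what your Resize/CenterCrop transform produces
    x = torch.randn(1, 3, 224, 224)
    with torch.no_grad():
        logits = model(x)

    print(logits.shape)          # torch.Size([1, 1000]) -> one logit per ImageNet class
    print(logits.argmax(dim=1))  # ImageNet class indices, which can be anywhere in 0..999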

If you want predictions for CIFAR10, use a model that was pretrained (or fine-tuned) on CIFAR10.
This repo might help:
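Alternatively, if you want to adapt the torchvision model yourself, a common approach (my own sketch, not taken from the repo above) is to replace the final fully connected layer with a 10-class head and fine-tune on CIFAR10. The hyperparameters below are just placeholders:

    import torch
    import torch.nn as nn
    import torchvision
    import torchvision.transforms as transforms

    # Upscale CIFAR10 images to 224x224 so the pretrained backbone accepts them,
    # and normalize with the ImageNet statistics it was trained with.
    transform = transforms.Compose([
        transforms.Resize(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
    ])

    train_set = torchvision.datasets.CIFAR10(root="./data", train=True,
                                             download=True, transform=transform)
    train_loader = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True)

    model = torchvision.models.resnet18(pretrained=True)
    model.fc = nn.Linear(model.fc.in_features, 10)  # 10 CIFAR10 classes instead of 1000

    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

    # One (placeholder) fine-tuning epoch
    model.train()
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

After fine-tuning, the predicted indices will fall in the 0-9 range and map to the CIFAR10 classes.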
