Input and target shapes do not match in a pre-trained model

I’m using a pre-trained ResNet model, torchvision.models.resnet18(pretrained=True), to classify flower images. The dataset contains 112 classes, and I’m iterating over the DataLoader image by image:

for count, (image, label) in enumerate(train):
    print(image.size(), type(image), label.size(), type(label))  # torch.Size([1, 3, 224, 224]) <class 'torch.Tensor'> torch.Size([1]) <class 'torch.Tensor'>

When I calculate the loss with criterion(output.float(), label.float()), I receive the following error:

RuntimeError: input and target shapes do not match: input [1 x 1000], target [1] at /opt/conda/conda-bld/pytorch_1533672544752/work/aten/src/THNN/generic/MSECriterion.c:12

How can I fix this error while still using the pre-trained model? (The optimizer is Adam and the loss function is MSELoss.)

The model you are loading assumes there are 1000 classes (the ImageNet classes). You just have to rewrite the last fully connected layer to match your number of classes.

I have changed the last layer to resnet18.fc = torch.nn.Linear(512, 102, bias=True), but with MSELoss I still receive this error, only now the mismatch is input [1 x 102]. I have tested NLLLoss and it works, but I don’t want to use NLLLoss. Any tips? (I think I’m updating the whole fc, right?)

Because you are using batch size 1, your output is 1 x 102; however, you are probably setting your ground truth as an array of 102 values.

Since your input is 1 x 102 and your gt is just 102, they don’t match.

You have to unsqueeze your gt.
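A minimal sketch of that fix (the random 102-element tensor here is just a stand-in for the ground truth):

```python
import torch

criterion = torch.nn.MSELoss()

output = torch.rand(1, 102)  # model output: batch of 1, 102 class scores
gt = torch.rand(102)         # ground truth without a batch dimension

# criterion(output, gt) would complain: input [1 x 102] vs. target [102]
gt = gt.unsqueeze(0)         # add the batch dimension -> shape [1, 102]
print(gt.shape)              # torch.Size([1, 102])

loss = criterion(output, gt)  # shapes now match
```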

Sorry, I did not understand what you said. What exactly do I need to unsqueeze? I have tested other loss functions and, again, they work fine. I have set the epochs to 102 and I receive the same error. Here’s my notebook; I had to put it on Google Colab for memory reasons, so this is a very similar version of the current notebook.

I guess you simply have a dimensionality issue. Check this example:

import torch

a1 = torch.rand(1, 10)   # size [1, 10]
a2 = torch.rand(10)      # size [10]

torch.nn.L1Loss()(a1, a2)

Traceback (most recent call last):
  ...
RuntimeError: input and target shapes do not match: input [1 x 10], target [10] at /pytorch/aten/src/THNN/generic/AbsCriterion.c:12

Both tensors hold 10 elements. However, a1 is of size 1x10 while a2 is of size 10.

PyTorch usually assigns an additional dimension for the batch. That’s why your output is 1x102 and not just 102.
However, the ground truth is handled by you: if you don’t unsqueeze your ground truth to match the output’s dimensionality, it will throw the error you saw.
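One detail worth noting: in the original error the target had shape [1], i.e. a single class index. MSELoss compares element-wise, so an index target would have to be expanded to a 102-element vector (for example one-hot encoded), not just unsqueezed. A hedged sketch, assuming the label is an integer class index (this encoding step is my addition, not from the thread):

```python
import torch
import torch.nn.functional as F

num_classes = 102
output = torch.rand(1, num_classes)  # model output, shape [1, 102]
label = torch.tensor([3])            # class index, shape [1] (as in the error)

# MSELoss compares element-wise, so expand the index into a one-hot vector
target = F.one_hot(label, num_classes=num_classes).float()  # shape [1, 102]
print(target.shape)  # torch.Size([1, 102])

loss = torch.nn.MSELoss()(output, target)  # shapes now match
```

That said, for a classification task a loss designed for class indices (such as CrossEntropyLoss) avoids this encoding step entirely.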

Thanks, I have understood it and tested it on my code.