Size mismatch between inputs and the Linear layer

I’ll keep this brief. These are the shapes for the input I am passing to my network.

print(images.shape)
print(images.view(images.shape[0], -1).shape)
print([len(image_datasets[key]) for key in image_datasets.keys()])

torch.Size([32, 3, 224, 224])
torch.Size([32, 150528])
[6552, 818, 819]

For that reason, the first layer of my classifier inside the VGG16 network expects 150528 inputs. However, when I pass the reshaped tensor, I get the following error in the forward pass:

RuntimeError: size mismatch, m1: [32 x 25088], m2: [150528 x 7168]

I really don’t understand where 25088 comes from. Can someone please help?

Standard models usually use pooling layers, and VGG16 uses them as well.
You can check this by printing the architecture via print(model).
These layers decrease the spatial size of the activations (which can also happen in conv layers, depending on their stride and padding).
The activations will also have a different number of channels, which is defined by the number of filters in each convolution.

The 25088 is the number of input features, since the last activation, which is passed to the first linear layer, has the shape [batch_size, 512, 7, 7] (512 * 7 * 7 = 25088).

For more information about convolutions etc., I would recommend taking a look at CS231n - CNN.
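
If it helps, here is a minimal sketch showing where the 25088 comes from (assuming torchvision is available and a 224x224 input; the variable names are just for illustration):

import torch
import torchvision.models as models

model = models.vgg16()                      # same architecture, randomly initialized weights
print(model)                                # shows the features / avgpool / classifier blocks

x = torch.randn(1, 3, 224, 224)             # one 224x224 RGB image
acts = model.features(x)                    # conv + pooling stack
print(acts.shape)                           # torch.Size([1, 512, 7, 7])
print(acts.view(acts.size(0), -1).shape)    # torch.Size([1, 25088]) = 512 * 7 * 7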


Thanks @ptrblck. I realized this after posting the question. Anyway, I now have a new issue along the very same lines. When I try to run a prediction on a single image, after preprocessing it the same way my PyTorch transforms do, I get a size mismatch error. For example:
This is the shape of one batch from the dataloader: torch.Size([32, 3, 224, 224])
This is the shape of the image after preprocessing it: torch.Size([3, 224, 224])

When I try to predict this, I get the following error:
RuntimeError: expected stride to be a single integer value or a list of 1 values to match the convolution dimensions, but got stride=[1, 1]

Looking at your answers in other threads, I added unsqueeze_(0), which leads to this error:
ValueError: Expected input batch_size (1) to match target batch_size (32)

Please help me with this.

P.S. I understand that one solution might be to increase the size of the 0th dimension to 32 (please show me how to do it), but I am not able to understand why PyTorch cannot treat the batch size as independent of the input. Why does it care about the batch size?

Could you post the preprocessing code, please?
Somehow the batch dimension was reduced to a single example, which shouldn't be the case.

@ptrblck, I am taking the liberty of sharing the notebook itself so that I don't miss anything. Obviously, I am making a blunder that I am not able to pin down. The notebook is well documented, and I hope it won't take much of your time. The error is in cell 21.

Hi @ptrblck, there’s one more thing I need to point out. The error ValueError: Expected input batch_size (1) to match target batch_size (32) occurs when I load the checkpointed model. However, when I train the model and then use that model object for inference, I get the same error but with a different batch size: ValueError: Expected input batch_size (1) to match target batch_size (51). I am linking the notebook again with this error state preserved.

The error in cell 20/21 is caused by the wrong usage of the labels tensor:

img = img.to(device)
log_ps = model(img.unsqueeze(0))
test_loss = criterion(log_ps, labels)

While you are loading a single image and unsqueezing the batch dimension, which is fine, you are reusing labels from previous cells (most likely from the test DataLoader).
Instead of reusing labels, you should either load the target for the current image you would like to classify or skip the loss calculation if the target is unknown.
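
For the single-image case, something like this rough sketch should work (the names img, model, and device follow your snippet; the rest is an assumption about your setup):

model.eval()
img = img.to(device)

with torch.no_grad():
    log_ps = model(img.unsqueeze(0))    # [3, 224, 224] -> [1, 3, 224, 224]
    ps = torch.exp(log_ps)              # assuming the model outputs log-probabilities
    top_p, top_class = ps.topk(1, dim=1)

print(top_class.item())                 # predicted class index; no criterion/labels needed here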

I think the second error might be related.
Let me know if that helps or if you need more information.

Thanks a lot @ptrblck. Frankly, I am disappointed in myself for making such errors and, on top of that, for being unable to spot them.

I think this solves the problem for now. If need be, I’ll engage you again. Thanks once again!

You shouldn’t be disappointed for making such errors.
You can’t imagine how often I’m still running into these things. :smiley: