Using custom dataset for training a classifier, RuntimeError: at outputs = net(Variable(images))

isalirezag · August 19, 2017, 12:55am

Hello,

I see different people asking questions about using their own datasets for training.
I read the “transfer learning” and “training a classifier” tutorials and, based on them, made a simple classifier that trains on a custom dataset (hymenoptera_data). The code is provided at the end of this question.
My code works fine as long as I scale the images to 32 (using transforms.Scale). However, as soon as I change the image size to anything else (e.g. 64), my code gives me the following error, which is raised at outputs = net(Variable(images)):

RuntimeError Traceback (most recent call last)
in ()
13
14 # forward + backward + optimize
---> 15 outputs = net(inputs)
16 loss = criterion(outputs, labels)
17 loss.backward()

/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.pyc in __call__(self, *input, **kwargs)
204
205 def __call__(self, *input, **kwargs):
--> 206 result = self.forward(*input, **kwargs)
207 for hook in self._forward_hooks.values():
208 hook_result = hook(self, input, result)

in forward(self, x)
17 x = self.pool(F.relu(self.conv1(x)))
18 x = self.pool(F.relu(self.conv2(x)))
---> 19 x = x.view(-1, 16 * 5 * 5)
20 x = F.relu(self.fc1(x))
21 x = F.relu(self.fc2(x))

/usr/local/lib/python2.7/dist-packages/torch/autograd/variable.pyc in view(self, *sizes)
469
470 def view(self, *sizes):
--> 471 return View(*sizes)(self)
472
473 def view_as(self, tensor):

/usr/local/lib/python2.7/dist-packages/torch/autograd/_functions/tensor.pyc in forward(self, i)
96 def forward(self, i):
97 self.input_size = i.size()
---> 98 result = i.view(*self.sizes)
99 self.mark_shared_storage((i, result))
100 return result

RuntimeError: size '[-1 x 400]' is invalid for input of with 5408 elements at /b/wheel/pytorch-src/torch/lib/TH/THStorage.c:55

Can anyone please help me understand what is going on and how I can fix it?

Code:

%matplotlib inline
from __future__ import print_function, division
import torch
import torch.nn as nn
import torch.optim as optim
from torch.autograd import Variable
import numpy as np
import torchvision
from torchvision import datasets, models, transforms
import matplotlib.pyplot as plt
import time
import copy
import os

plt.ion()   # interactive mode


# Resize, center-crop, convert to tensor, and normalize
# (the same pipeline is used for training and testing)
ImageSize = 64
data_transforms = {
    'train': transforms.Compose([
        transforms.Scale(ImageSize + 2),
        transforms.CenterCrop(ImageSize),
        # Convert a PIL.Image or numpy.ndarray (H x W x C) in the range [0, 255]
        # to a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0].
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'test': transforms.Compose([
        transforms.Scale(ImageSize+2),
        transforms.CenterCrop(ImageSize),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
}


data_dir = 'hymenoptera_data'
dsets = {x: datasets.ImageFolder(os.path.join(data_dir, x), data_transforms[x])
         for x in ['train', 'test']}

dset_loaders = {x: torch.utils.data.DataLoader(dsets[x], batch_size=2,
                                               shuffle=True, num_workers=0)
                for x in ['train', 'test']}
dset_sizes = {x: len(dsets[x]) for x in ['train', 'test']}
dset_classes = dsets['train'].classes

use_gpu = False
torch.cuda.is_available()

# Training model:

# Variable and torch.nn are already imported above
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 2)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x


net = Net()

import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

# Training:

for epoch in range(2):  # loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate((dset_loaders['train']),0):
        # get the inputs
        inputs, labels = data
#         print(i)
        # wrap them in Variable
        inputs, labels = Variable(inputs), Variable(labels)

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.data[0]
        if i % 15 == 4:    # print every 15 mini-batches (first at i == 4)
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training')

Crazyai · August 21, 2017, 1:42am

Once you change the input image size, the size of the feature map produced by the last conv layer changes as well, so the view operation fails.

The error RuntimeError: size '[-1 x 400]' is invalid for input of with 5408 elements at /b/wheel/pytorch-src/torch/lib/TH/THStorage.c:55 is raised because 5408 (the actual number of elements in the feature maps for the batch) is not divisible by 400 (the input size of the fully connected layer you defined). With 64x64 inputs and batch_size=2, that is 2 x 16 x 13 x 13 = 5408.
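To make the numbers concrete, here is a minimal sketch in plain Python (no PyTorch needed) that traces the spatial size through the layers of the Net above. The helper names conv_out and flattened_features are made up for this illustration; the formula is the standard output-size rule for convolution and pooling layers without dilation.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial size after a conv or pooling layer (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


def flattened_features(image_size, channels=16):
    """Per-image element count after conv1 -> pool -> conv2 -> pool."""
    s = conv_out(image_size, kernel=5)     # conv1: nn.Conv2d(3, 6, 5)
    s = conv_out(s, kernel=2, stride=2)    # pool:  nn.MaxPool2d(2, 2)
    s = conv_out(s, kernel=5)              # conv2: nn.Conv2d(6, 16, 5)
    s = conv_out(s, kernel=2, stride=2)    # pool:  nn.MaxPool2d(2, 2)
    return channels * s * s


batch_size = 2  # same batch size as the DataLoader in the question
for image_size in (32, 64):
    per_image = flattened_features(image_size)
    total = batch_size * per_image
    # view(-1, 400) only works when the total is a multiple of 400
    print(image_size, per_image, total, total % 400 == 0)
```

For 32x32 inputs each image flattens to 16 * 5 * 5 = 400 values, so view(-1, 400) is valid. For 64x64 inputs each image flattens to 16 * 13 * 13 = 2704 values, and a batch of two gives the 5408 elements reported in the error.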

isalirezag · August 22, 2017, 5:05pm

But I guess even a 32x32 image cannot be divided by 400.
Do you have any suggestions on how to change the code so that it works for 64x64 input?
Thanks

Crazyai · August 23, 2017, 1:37am

isalirezag:

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 2)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

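After the conv and pooling layers, the feature map has size 16x5x5 for a 3x32x32 input image. That is where the 16 * 5 * 5 = 400 in fc1 and in the view call comes from, so those two places have to change whenever the input size changes.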
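One possible way to make the network work for 64x64 (or other) inputs is to stop hard-coding 16 * 5 * 5 and let the model work out the flattened size itself. The sketch below is only an illustration, not the code from this thread: it assumes a recent PyTorch version in which tensors can be passed to the model directly (no Variable wrapper), and the image_size and num_classes arguments and the _features helper are names introduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self, image_size=64, num_classes=2):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # Push one dummy image through the conv/pool part to measure the
        # flattened feature size instead of hard-coding 16 * 5 * 5.
        with torch.no_grad():
            dummy = torch.zeros(1, 3, image_size, image_size)
            n_features = self._features(dummy).numel()
        self.fc1 = nn.Linear(n_features, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, num_classes)

    def _features(self, x):
        # Convolutional part, shared by __init__ (size probing) and forward.
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        return x

    def forward(self, x):
        x = self._features(x)
        x = x.view(x.size(0), -1)  # keep the batch dimension, flatten the rest
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)


net = Net(image_size=64)
outputs = net(torch.zeros(2, 3, 64, 64))  # a batch of two 64x64 RGB images
print(net.fc1.in_features, tuple(outputs.shape))
```

Writing x.view(x.size(0), -1) instead of x.view(-1, 400) also means that a size mismatch shows up as a shape error in fc1 rather than as a silently regrouped batch. The image_size passed to Net has to match the size produced by CenterCrop in the data transforms.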

isalirezag · August 23, 2017, 3:19am

Ohhhh Got it!!! Thanks

isalirezag · October 28, 2017, 9:06pm

Maybe this can be helpful: LINK