Not able to train on ImageNet using the Inception architecture

Hi all!

I tried training the Inception architecture on ImageNet as given here (Imagenet training code), but it throws an error. The problem is probably in the Inception code, because for other architectures main.py works as expected.

Here is the traceback -

Traceback (most recent call last):
  File "main.py", line 316, in <module>
    main()
  File "main.py", line 158, in main
    train(train_loader, model, criterion, optimizer, epoch)
  File "main.py", line 195, in train
    output = model(input_var)
  File "/home/abhishs8/Research/Misc_Experiments/Mystuff/.torch-env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/abhishs8/Research/Misc_Experiments/Mystuff/.torch-env/lib/python3.6/site-packages/torch/nn/modules/container.py", line 67, in forward
    input = module(input)
  File "/home/abhishs8/Research/Misc_Experiments/Mystuff/.torch-env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/abhishs8/Research/Misc_Experiments/Mystuff/.torch-env/lib/python3.6/site-packages/torchvision/models/inception.py", line 109, in forward
    aux = self.AuxLogits(x)
  File "/home/abhishs8/Research/Misc_Experiments/Mystuff/.torch-env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/abhishs8/Research/Misc_Experiments/Mystuff/.torch-env/lib/python3.6/site-packages/torchvision/models/inception.py", line 308, in forward
    x = self.conv1(x)
  File "/home/abhishs8/Research/Misc_Experiments/Mystuff/.torch-env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/abhishs8/Research/Misc_Experiments/Mystuff/.torch-env/lib/python3.6/site-packages/torchvision/models/inception.py", line 327, in forward
    x = self.conv(x)
  File "/home/abhishs8/Research/Misc_Experiments/Mystuff/.torch-env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/abhishs8/Research/Misc_Experiments/Mystuff/.torch-env/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 277, in forward
    self.padding, self.dilation, self.groups)
  File "/home/abhishs8/Research/Misc_Experiments/Mystuff/.torch-env/lib/python3.6/site-packages/torch/nn/functional.py", line 90, in conv2d
    return f(input, weight, bias)
RuntimeError: Given input size: (128 x 3 x 3). Calculated output size: (768 x -1 x -1). Output size is too small at /pytorch/torch/lib/THNN/generic/SpatialConvolutionMM.c:45

It is telling you that the input size to the conv layer is too small. IIRC, the input image size for the Inception net is not 224*224. Are you using a 224*224 input size for the Inception net?


Yes, this part of the code transforms the image size to 224*224 explicitly. Thanks for pointing that out; I've changed it to 299*299.
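For reference, a minimal sketch of the change, assuming the usual torchvision transform pipeline from the example's main.py (the variable names here are illustrative):

import torchvision.transforms as transforms

# Standard ImageNet mean/std normalization, as in the example script
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

# Inception v3 expects 299*299 inputs instead of the usual 224*224
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(299),  # was 224
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    normalize,
])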
Unfortunately, I am getting the following error now -

Traceback (most recent call last):
  File "main.py", line 316, in <module>
    main()
  File "main.py", line 158, in main
    train(train_loader, model, criterion, optimizer, epoch)
  File "main.py", line 196, in train
    loss = criterion(output, target_var)
  File "/home/abhishs8/Research/Misc_Experiments/Mystuff/.torch-env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/abhishs8/Research/Misc_Experiments/Mystuff/.torch-env/lib/python3.6/site-packages/torch/nn/modules/loss.py", line 601, in forward
    self.ignore_index, self.reduce)
  File "/home/abhishs8/Research/Misc_Experiments/Mystuff/.torch-env/lib/python3.6/site-packages/torch/nn/functional.py", line 1140, in cross_entropy
    return nll_loss(log_softmax(input, 1), target, weight, size_average, ignore_index, reduce)
  File "/home/abhishs8/Research/Misc_Experiments/Mystuff/.torch-env/lib/python3.6/site-packages/torch/nn/functional.py", line 786, in log_softmax
    return torch._C._nn.log_softmax(input, dim)
RuntimeError: log_softmax(): argument 'input' (position 1) must be Variable, not tuple

Without the code, it is hard to tell what is wrong. You can also try to debug based on the error message, e.g., by printing the input out.

Basically, it is because of the two outputs returned in the output here: Inception v3 returns the auxiliary classifier's output as well. I saw a similar discussion on another thread, but it is still not clear how to incorporate the changes.
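A quick way to see this behavior, as a minimal sketch against torchvision's inception_v3 (recent versions return a named tuple in training mode, but the effect is the same):

import torch
import torchvision.models as models

model = models.inception_v3(aux_logits=True)

model.train()
out = model(torch.randn(2, 3, 299, 299))
print(type(out))  # a tuple of (logits, aux_logits) in training mode

model.eval()
out = model(torch.randn(2, 3, 299, 299))
print(out.shape)  # only the main logits in eval mode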

Done!
The problem was that I made changes expecting a loss from the aux classifier in eval mode as well. My bad. Thanks @jdhao 🙂
For others who run into the same problem, this link may help.
For the devs: it would be good if the necessary if-conditions for the different models were added to main.py.


I think the actual code here would be more helpful than just a notebook link.

# Inception v3 returns (logits, aux_logits) in training mode,
# so sum the criterion over all returned outputs
if isinstance(outputs, tuple):
    loss = sum(criterion(o, labels) for o in outputs)
else:
    loss = criterion(outputs, labels)
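For comparison, the torchvision finetuning tutorial linked in the next post down-weights the auxiliary loss instead of summing both outputs equally; a sketch of that variant, assuming the same outputs/labels/criterion names as above:

# Down-weight the auxiliary classifier's loss (the tutorial uses 0.4)
if isinstance(outputs, tuple):
    main_out, aux_out = outputs
    loss = criterion(main_out, labels) + 0.4 * criterion(aux_out, labels)
else:
    loss = criterion(outputs, labels)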

This worked for me

https://pytorch.org/tutorials/beginner/finetuning_torchvision_models_tutorial.html