Not able to train on ImageNet using the Inception architecture

Hi all!

I tried training the Inception architecture on ImageNet as given here (Imagenet training code), but it throws an error. The problem is probably in the Inception code, because for other architectures main.py works as expected.

Here is the traceback -

Traceback (most recent call last):
  File "main.py", line 316, in <module>
    main()
  File "main.py", line 158, in main
    train(train_loader, model, criterion, optimizer, epoch)
  File "main.py", line 195, in train
    output = model(input_var)
  File "/home/abhishs8/Research/Misc_Experiments/Mystuff/.torch-env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/abhishs8/Research/Misc_Experiments/Mystuff/.torch-env/lib/python3.6/site-packages/torch/nn/modules/container.py", line 67, in forward
    input = module(input)
  File "/home/abhishs8/Research/Misc_Experiments/Mystuff/.torch-env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/abhishs8/Research/Misc_Experiments/Mystuff/.torch-env/lib/python3.6/site-packages/torchvision/models/inception.py", line 109, in forward
    aux = self.AuxLogits(x)
  File "/home/abhishs8/Research/Misc_Experiments/Mystuff/.torch-env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/abhishs8/Research/Misc_Experiments/Mystuff/.torch-env/lib/python3.6/site-packages/torchvision/models/inception.py", line 308, in forward
    x = self.conv1(x)
  File "/home/abhishs8/Research/Misc_Experiments/Mystuff/.torch-env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/abhishs8/Research/Misc_Experiments/Mystuff/.torch-env/lib/python3.6/site-packages/torchvision/models/inception.py", line 327, in forward
    x = self.conv(x)
  File "/home/abhishs8/Research/Misc_Experiments/Mystuff/.torch-env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/abhishs8/Research/Misc_Experiments/Mystuff/.torch-env/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 277, in forward
    self.padding, self.dilation, self.groups)
  File "/home/abhishs8/Research/Misc_Experiments/Mystuff/.torch-env/lib/python3.6/site-packages/torch/nn/functional.py", line 90, in conv2d
    return f(input, weight, bias)
RuntimeError: Given input size: (128 x 3 x 3). Calculated output size: (768 x -1 x -1). Output size is too small at /pytorch/torch/lib/THNN/generic/SpatialConvolutionMM.c:45

It is telling you that the input size to the conv layer is too small. IIRC, the input image size for the Inception net is not 224*224. Are you using a 224*224 input size for the Inception net?


Yes, this part of the code transforms the image size to 224*224 explicitly. Thanks for pointing that out; I've changed it to 299*299.
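For reference, a minimal sketch of the change, assuming the usual torchvision transform pipeline from the example's main.py (the variable names here are illustrative):

import torchvision.transforms as transforms

# Standard ImageNet mean/std normalization, as in the example script
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

# Inception v3 expects 299*299 inputs instead of the usual 224*224
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(299),  # was 224
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    normalize,
])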
Unfortunately, I am getting the following error now -

Traceback (most recent call last):
  File "main.py", line 316, in <module>
    main()
  File "main.py", line 158, in main
    train(train_loader, model, criterion, optimizer, epoch)
  File "main.py", line 196, in train
    loss = criterion(output, target_var)
  File "/home/abhishs8/Research/Misc_Experiments/Mystuff/.torch-env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/abhishs8/Research/Misc_Experiments/Mystuff/.torch-env/lib/python3.6/site-packages/torch/nn/modules/loss.py", line 601, in forward
    self.ignore_index, self.reduce)
  File "/home/abhishs8/Research/Misc_Experiments/Mystuff/.torch-env/lib/python3.6/site-packages/torch/nn/functional.py", line 1140, in cross_entropy
    return nll_loss(log_softmax(input, 1), target, weight, size_average, ignore_index, reduce)
  File "/home/abhishs8/Research/Misc_Experiments/Mystuff/.torch-env/lib/python3.6/site-packages/torch/nn/functional.py", line 786, in log_softmax
    return torch._C._nn.log_softmax(input, dim)
RuntimeError: log_softmax(): argument 'input' (position 1) must be Variable, not tuple

Without the code, it is hard to tell what is wrong. You can also try to debug based on the error message, e.g., by printing the input out.

Basically, it is because of the two outputs returned in the output here: Inception v3 returns the auxiliary classifier's output as well. I saw a similar discussion on another thread, but it is still not clear how to incorporate the changes.
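A quick way to see this behavior, as a minimal sketch against torchvision's inception_v3 (recent versions return a named tuple in training mode, but the effect is the same):

import torch
import torchvision.models as models

model = models.inception_v3(aux_logits=True)

model.train()
out = model(torch.randn(2, 3, 299, 299))
print(type(out))  # a tuple of (logits, aux_logits) in training mode

model.eval()
out = model(torch.randn(2, 3, 299, 299))
print(out.shape)  # only the main logits in eval mode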

Done!
The problem was that I made changes expecting a loss from the aux classifier in eval mode as well. My bad. Thanks @jdhao 🙂
For others who run into the same problem, this link may help.
For the devs: it would be good if the necessary if-conditions for the different models were added to main.py.


I think the actual code here would be more helpful than just a notebook link.

# Inception v3 returns (logits, aux_logits) in training mode,
# so sum the criterion over all returned outputs
if isinstance(outputs, tuple):
    loss = sum(criterion(o, labels) for o in outputs)
else:
    loss = criterion(outputs, labels)
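For comparison, the torchvision finetuning tutorial linked in the next post down-weights the auxiliary loss instead of summing both outputs equally; a sketch of that variant, assuming the same outputs/labels/criterion names as above:

# Down-weight the auxiliary classifier's loss (the tutorial uses 0.4)
if isinstance(outputs, tuple):
    main_out, aux_out = outputs
    loss = criterion(main_out, labels) + 0.4 * criterion(aux_out, labels)
else:
    loss = criterion(outputs, labels)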

This worked for me

https://pytorch.org/tutorials/beginner/finetuning_torchvision_models_tutorial.html