I tried to fine-tune AlexNet (transfer learning with the '--pretrained' option) after increasing the input size from 224x224 to 512x512, by changing these variables in 'alexnet.py':
nn.Linear(256 * 15 * 15, 4096) in __init__(...)
and
x = x.view(x.size(0), 256 * 15 * 15) in forward()
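For reference, the two changes above amount to something like the following sketch (layer definitions follow torchvision's alexnet.py; the class name AlexNet512 is made up for illustration). With a 512x512 input, the conv stack does produce a 15x15 feature map, so 256 * 15 * 15 is the right flattened size:

```python
import torch
import torch.nn as nn

class AlexNet512(nn.Module):
    """torchvision-style AlexNet with the classifier resized for 512x512 input."""
    def __init__(self, num_classes=1000):
        super(AlexNet512, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(64, 192, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(192, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        self.classifier = nn.Sequential(
            nn.Dropout(),
            # 512x512 input -> 15x15 spatial map after the conv stack
            # (224x224 input gives 6x6, hence the original 256 * 6 * 6)
            nn.Linear(256 * 15 * 15, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size(0), 256 * 15 * 15)  # the changed view() call
        x = self.classifier(x)
        return x

model = AlexNet512()
out = model(torch.randn(1, 3, 512, 512))
print(out.size())
```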
The non-pretrained case works, but the pretrained one doesn't. Is this a limitation of the current version (maybe related to the new graph structure)?
Error message in the pretrained case:
Traceback (most recent call last):
File "main.py", line 314, in <module>
main()
File "main.py", line 68, in main
model = models.__dict__[args.arch](pretrained=True)
File "/home/dylee/.conda/envs/pytorch/lib/python2.7/site-packages/torchvision/models/alexnet.py", line 57, in alexnet
model.load_state_dict(model_zoo.load_url(model_urls['alexnet']))
File "/home/dylee/.conda/envs/pytorch/lib/python2.7/site-packages/torch/nn/modules/module.py", line 315, in load_state_dict
own_state[name].copy_(param)
RuntimeError: inconsistent tensor size at /data/users/soumith/miniconda2/conda-bld/pytorch-cuda80-0.1.10_1488756735684/work/torch/lib/TH/generic/THTensorCopy.c:51
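The error comes from load_state_dict trying to copy the pretrained first classifier weight (shape 4096 x 9216) into the resized layer (shape 4096 x 57600). One possible workaround (a sketch, not code from the post: the technique is to copy only the parameters whose shapes still match, so the resized layer is left to train from scratch). The toy Linear layers below stand in for the real classifier:

```python
import torch
import torch.nn as nn

# Stand-ins: "pretrained" classifier head (224x224) vs. the resized one (512x512).
pretrained = nn.Sequential(nn.Linear(256 * 6 * 6, 4096), nn.Linear(4096, 1000))
modified = nn.Sequential(nn.Linear(256 * 15 * 15, 4096), nn.Linear(4096, 1000))

pretrained_state = pretrained.state_dict()
own_state = modified.state_dict()

# Keep only parameters whose shapes match the modified model; the resized
# Linear's weight (4096 x 57600 vs. 4096 x 9216) is skipped automatically.
filtered = {k: v for k, v in pretrained_state.items()
            if k in own_state and v.size() == own_state[k].size()}
own_state.update(filtered)
modified.load_state_dict(own_state)

print(sorted(filtered.keys()))  # the mismatched '0.weight' is absent
```

The same filtering applies to the real model: build the modified AlexNet, fetch the pretrained state dict (e.g. via model_zoo.load_url), filter it as above, and load the merged dict.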