Loading pre-trained BNInception in Pytorch 0.4

I am not able to load a checkpoint saved with PyTorch 0.3 into PyTorch 0.4. The architecture is BNInception, and I get the following error:
Traceback (most recent call last):
  File "bninception.py", line 513, in <module>
    model = bninception()
  File "bninception.py", line 504, in bninception
    model.load_state_dict(model_zoo.load_url(settings['url']))
  File "/anaconda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 719, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for BNInception:
	size mismatch for conv1_7x7_s2_bn.weight: copying a param of torch.Size([64]) from checkpoint, where the shape is torch.Size([1, 64]) in current model.
	size mismatch for conv1_7x7_s2_bn.bias: copying a param of torch.Size([64]) from checkpoint, where the shape is torch.Size([1, 64]) in current model.
	size mismatch for conv1_7x7_s2_bn.running_mean: copying a param of torch.Size([64]) from checkpoint, where the shape is torch.Size([1, 64]) in current model.
	size mismatch for conv1_7x7_s2_bn.running_var: copying a param of torch.Size([64]) from checkpoint, where the shape is torch.Size([1, 64]) in current model.
	size mismatch for conv2_3x3_reduce_bn.weight: copying a param of torch.Size([64]) from checkpoint, where the shape is torch.Size([1, 64]) in current model.
	size mismatch for conv2_3x3_reduce_bn.bias: copying a param of torch.Size([64]) from checkpoint, where the shape is torch.Size([1, 64]) in current model.

Loading models is backward compatible but not forward compatible, i.e., you can load a model saved with 0.4 in 0.3, but not one saved with 0.3 in 0.4 (see issue 6801).
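In 0.3-era checkpoints, the BatchNorm parameters were saved with shape `[1, N]`, while newer versions expect `[N]`. A minimal, hedged workaround (my own sketch, not an official API) is to flatten those tensors before calling `load_state_dict`:

```python
import torch

def flatten_bn_params(state_dict):
    """Flatten [1, N]-shaped tensors (as saved by PyTorch 0.3 BatchNorm
    layers) to the [N] shape expected by PyTorch >= 0.4."""
    return {
        name: w.view(-1) if w.dim() == 2 and w.size(0) == 1 else w
        for name, w in state_dict.items()
    }

# model.load_state_dict(flatten_bn_params(torch.load(checkpoint_path)))
```

This only touches tensors with a leading singleton dimension, so convolution and linear weights pass through unchanged.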

I have a similar problem, but the error is a little different; my PyTorch version is 1.1.0.
The error is as follows:
Traceback (most recent call last):
  File "D:/Git/demo/tsn-pytorch/main.py", line 301, in <module>
    main()
  File "D:/Git/demo/tsn-pytorch/main.py", line 35, in main
    consensus_type=args.consensus_type, dropout=args.dropout, partial_bn=not args.no_partialbn)
  File "D:\Git\demo\tsn-pytorch\models.py", line 39, in __init__
    self._prepare_base_model(base_model)
  File "D:\Git\demo\tsn-pytorch\models.py", line 96, in _prepare_base_model
    self.base_model = getattr(tf_model_zoo, base_model)()
  File "D:\Git\demo\tsn-pytorch\tf_model_zoo\bninception\pytorch_load.py", line 35, in __init__
    self.load_state_dict(torch.utils.model_zoo.load_url(weight_url))
  File "D:\Anaconda\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 777, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for BNInception:
	size mismatch for conv1_7x7_s2_bn.weight: copying a param with shape torch.Size([1, 64]) from checkpoint, the shape in current model is torch.Size([64]).
	size mismatch for conv1_7x7_s2_bn.bias: copying a param with shape torch.Size([1, 64]) from checkpoint, the shape in current model is torch.Size([64]).
	size mismatch for conv1_7x7_s2_bn.running_mean: copying a param with shape torch.Size([1, 64]) from checkpoint, the shape in current model is torch.Size([64]).
	size mismatch for conv1_7x7_s2_bn.running_var: copying a param with shape torch.Size([1, 64]) from checkpoint, the shape in current model is torch.Size([64]).
Can you help me? Many thanks!

I have a similar problem. Can you tell me how you solved it? Thank you

This worked for me:

import torch

def bninception(num_classes=1000, pretrained=None, path='/workspace/model/bn_inception.pth'):
    model = BNInception(num_classes=num_classes)
    if pretrained is not None:
        # First load the checkpoint weights into a dictionary
        state_dict = torch.load(path)
        # Flatten the [1, N]-shaped BatchNorm tensors to [N] by iterating the dictionary
        for name, weights in state_dict.items():
            if 'conv1_7x7_s2_bn' in name or 'conv2_3x3' in name or 'inception' in name:
                if len(weights.size()) == 2:
                    state_dict[name] = weights.view(-1)

        # Load the updated weights into the model
        model.load_state_dict(state_dict)
    return model
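If your model uses different layer names, a more general variant (my own hedged sketch, not from this thread) avoids hard-coding them: reshape any checkpoint tensor whose element count matches the shape the model expects.

```python
import torch
import torch.nn as nn

def load_with_reshape(model, state_dict):
    """Reshape checkpoint tensors to the model's expected shapes when the
    element counts match (handles 0.3-era [1, N] BatchNorm params)."""
    target = model.state_dict()
    fixed = {}
    for name, w in state_dict.items():
        if (name in target and w.shape != target[name].shape
                and w.numel() == target[name].numel()):
            # Same number of elements, different layout: safe to reshape
            fixed[name] = w.view(target[name].shape)
        else:
            fixed[name] = w
    model.load_state_dict(fixed)
    return model
```

Because it compares against the model's own `state_dict()`, this works for any architecture, not just BNInception.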