Size mismatch issue that could be solved with a view while loading pre-trained models


I’m trying to implement the baseline C2D model from the Non-local Neural Networks paper (Table 1). I’d like to initialize the video ResNet from the model pre-trained on ImageNet, as they do in the paper:

All convolutions in Table 1 are in essence 2D kernels that process the input frame-by-frame (implemented as 1×k×k kernels). This model can be directly initialized from the ResNet weights pre-trained on ImageNet.

I do exactly as described but I get this error:

  File "model/", line 172, in c2d
    model.load_state_dict(model_zoo.load_url(model_urls['resnet50']), strict=False)
  File "/users/fguney/.local/lib/python2.7/site-packages/torch/nn/modules/", line 721, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for ResNetVideo:
        While copying the parameter named "conv1.weight", whose dimensions in the model are torch.Size([64, 3, 1, 7, 7]) and whose dimensions in the checkpoint are torch.Size([64, 3, 7, 7]).

I understand the error, but is there a way around it? I thought the strict=False argument was exactly for that?


Patch your state_dict before loading. The strict argument only controls missing/unexpected keys; even with strict=False, you can’t copy a parameter from a tensor of a different size.
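A minimal sketch of such patching, assuming a helper name of my own (`inflate_2d_to_c2d` is hypothetical, not a PyTorch API): every 4-D conv weight `[out, in, k, k]` in the ImageNet checkpoint gets a singleton time dimension inserted, giving `[out, in, 1, k, k]` to match the 1×k×k kernels of C2D. The toy dict below stands in for `model_zoo.load_url(model_urls['resnet50'])`:

```python
import torch

def inflate_2d_to_c2d(state_dict):
    """Insert a singleton time dimension into every 4-D conv weight,
    turning [out, in, k, k] into [out, in, 1, k, k] (a 1xkxk kernel).
    BatchNorm (1-D) and fc (2-D) parameters pass through unchanged."""
    patched = {}
    for name, param in state_dict.items():
        if param.dim() == 4:            # only conv weights are 4-D in ResNet
            param = param.unsqueeze(2)  # add the temporal axis
        patched[name] = param
    return patched

# Toy checkpoint standing in for the real ResNet-50 state_dict:
sd = {'conv1.weight': torch.zeros(64, 3, 7, 7),
      'bn1.weight': torch.zeros(64),
      'fc.weight': torch.zeros(1000, 2048)}
patched = inflate_2d_to_c2d(sd)
print(patched['conv1.weight'].shape)  # torch.Size([64, 3, 1, 7, 7])
# then: model.load_state_dict(patched, strict=False)
```

This reuses the checkpoint storage (`unsqueeze` is a view, no copy), which is what the thread title hints at.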

What does it mean to patch the state_dict?

Probably not the best solution, but I just changed line 659 in