RuntimeError: Error(s) in loading state_dict for DataParallel: Missing key(s) in state_dict

I’m new to PyTorch and working on some image-processing projects. I’m currently using ResNet101, and I’m facing this problem:

RuntimeError: Error(s) in loading state_dict for DataParallel:
Missing key(s) in state_dict:
"module.context_encoding.stages.0.2.bn.weight", "module.context_encoding.stages.0.2.bn.bias", "module.context_encoding.stages.0.2.bn.running_mean", "module.context_encoding.stages.0.2.bn.running_var",
"module.context_encoding.stages.1.2.bn.weight", "module.context_encoding.stages.1.2.bn.bias", "module.context_encoding.stages.1.2.bn.running_mean", "module.context_encoding.stages.1.2.bn.running_var",
"module.context_encoding.stages.2.2.bn.weight", "module.context_encoding.stages.2.2.bn.bias", "module.context_encoding.stages.2.2.bn.running_mean", "module.context_encoding.stages.2.2.bn.running_var",
"module.context_encoding.stages.3.2.bn.weight", "module.context_encoding.stages.3.2.bn.bias", "module.context_encoding.stages.3.2.bn.running_mean", "module.context_encoding.stages.3.2.bn.running_var",
"module.context_encoding.bottleneck.1.bn.weight", "module.context_encoding.bottleneck.1.bn.bias", "module.context_encoding.bottleneck.1.bn.running_mean", "module.context_encoding.bottleneck.1.bn.running_var",
"module.edge.conv1.1.bn.weight", "module.edge.conv1.1.bn.bias", "module.edge.conv1.1.bn.running_mean", "module.edge.conv1.1.bn.running_var",
"module.edge.conv2.1.bn.weight", "module.edge.conv2.1.bn.bias", "module.edge.conv2.1.bn.running_mean", "module.edge.conv2.1.bn.running_var",
"module.edge.conv3.1.bn.weight", "module.edge.conv3.1.bn.bias", "module.edge.conv3.1.bn.running_mean", "module.edge.conv3.1.bn.running_var",
"module.decoder.conv1.1.bn.weight", "module.decoder.conv1.1.bn.bias", "module.decoder.conv1.1.bn.running_mean", "module.decoder.conv1.1.bn.running_var",
"module.decoder.conv2.1.bn.weight", "module.decoder.conv2.1.bn.bias", "module.decoder.conv2.1.bn.running_mean", "module.decoder.conv2.1.bn.running_var",
"module.decoder.conv3.1.bn.weight", "module.decoder.conv3.1.bn.bias", "module.decoder.conv3.1.bn.running_mean", "module.decoder.conv3.1.bn.running_var",
"module.decoder.conv3.3.bn.weight", "module.decoder.conv3.3.bn.bias", "module.decoder.conv3.3.bn.running_mean", "module.decoder.conv3.3.bn.running_var",
"module.fushion.1.bn.weight", "module.fushion.1.bn.bias", "module.fushion.1.bn.running_mean", "module.fushion.1.bn.running_var".
Unexpected key(s) in state_dict:
"module.context_encoding.stages.0.2.weight", "module.context_encoding.stages.0.2.bias", "module.context_encoding.stages.0.2.running_mean", "module.context_encoding.stages.0.2.running_var",
"module.context_encoding.stages.1.2.weight", "module.context_encoding.stages.1.2.bias", "module.context_encoding.stages.1.2.running_mean", "module.context_encoding.stages.1.2.running_var",
"module.context_encoding.stages.2.2.weight", "module.context_encoding.stages.2.2.bias", "module.context_encoding.stages.2.2.running_mean", "module.context_encoding.stages.2.2.running_var",
"module.context_encoding.stages.3.2.weight", "module.context_encoding.stages.3.2.bias", "module.context_encoding.stages.3.2.running_mean", "module.context_encoding.stages.3.2.running_var",
"module.context_encoding.bottleneck.1.weight", "module.context_encoding.bottleneck.1.bias", "module.context_encoding.bottleneck.1.running_mean", "module.context_encoding.bottleneck.1.running_var",
"module.edge.conv1.1.weight", "module.edge.conv1.1.bias", "module.edge.conv1.1.running_mean", "module.edge.conv1.1.running_var",
"module.edge.conv2.1.weight", "module.edge.conv2.1.bias", "module.edge.conv2.1.running_mean", "module.edge.conv2.1.running_var",
"module.edge.conv3.1.weight", "module.edge.conv3.1.bias", "module.edge.conv3.1.running_mean", "module.edge.conv3.1.running_var",
"module.decoder.conv1.1.weight", "module.decoder.conv1.1.bias", "module.decoder.conv1.1.running_mean", "module.decoder.conv1.1.running_var",
"module.decoder.conv2.1.weight", "module.decoder.conv2.1.bias", "module.decoder.conv2.1.running_mean", "module.decoder.conv2.1.running_var",
"module.decoder.conv3.1.weight", "module.decoder.conv3.1.bias", "module.decoder.conv3.1.running_mean", "module.decoder.conv3.1.running_var",
"module.decoder.conv3.3.weight", "module.decoder.conv3.3.bias", "module.decoder.conv3.3.running_mean", "module.decoder.conv3.3.running_var",
"module.fushion.1.weight", "module.fushion.1.bias", "module.fushion.1.running_mean", "module.fushion.1.running_var".

I have only a little knowledge of PyTorch. Based on what I understand from the error message, it seems like the batch-norm (bn) layers are not being loaded. Can anyone help me with this? Thanks!

@ptrblck I’ve seen you helping out others with problems similar to mine. Could you please help? Thank you very, very much :pleading_face:

I guess you are trying to load a state_dict stored by a plain model into an nn.DataParallel model, as the load_state_dict method is complaining about the missing .module attributes.
If so, then try to load the state_dict into the model before wrapping it into nn.DataParallel:

model = MyModel()
model.load_state_dict(torch.load(path))
model = nn.DataParallel(model)

Thanks for replying!

I have changed my code from:
model = network(num_classes=num_classes, pretrained=None)
model = nn.DataParallel(model)
state_dict = torch.load(args.restore_weight)['state_dict']
model.load_state_dict(state_dict)
model.eval()

To:
model = network(num_classes=num_classes, pretrained=None)
model = nn.DataParallel(model)
state_dict = torch.load(args.restore_weight)['state_dict']
model.load_state_dict(torch.load('./resnet101-imagenet.pth'))['state_dict']
model.load_state_dict(state_dict)
model.eval()

Could you please guide me if I’m doing it correctly?
Now the runtime error is:
RuntimeError: Error(s) in loading state_dict for DataParallel: Missing key(s) in state_dict: "module.conv1.weight", "module.bn1.weight", "module.bn1.bias", "module.bn1.running_mean",…
Unexpected key(s) in state_dict: "conv1.weight", "bn1.running_mean", "bn1.running_var", "bn1.weight", "bn1.bias", "layer1.0.conv1.weight", "layer1.0.bn1.running_mean",…

Now it says the "module." prefix is missing…

You are still wrapping the model into nn.DataParallel before loading the state_dict and the error message is thus the same.
Swap the order: i.e. load the state_dict into the model and wrap it into nn.DataParallel afterwards.

Now my code is:
model = network(num_classes=num_classes, pretrained=None)
state_dict = model.load_state_dict(torch.load('./resnet101-imagenet.pth'))
model.load_state_dict(state_dict)
model = nn.DataParallel(model)
model.eval()

RuntimeError: Error(s) in loading state_dict for ResNet:
Missing key(s) in state_dict:
"context_encoding.stages.0.1.weight", "context_encoding.stages.0.2.bn.weight", "context_encoding.stages.0.2.bn.bias", "context_encoding.stages.0.2.bn.running_mean", "context_encoding.stages.0.2.bn.running_var",
"context_encoding.stages.1.1.weight", "context_encoding.stages.1.2.bn.weight", "context_encoding.stages.1.2.bn.bias", "context_encoding.stages.1.2.bn.running_mean", "context_encoding.stages.1.2.bn.running_var",
"context_encoding.stages.2.1.weight", "context_encoding.stages.2.2.bn.weight", "context_encoding.stages.2.2.bn.bias", "context_encoding.stages.2.2.bn.running_mean", "context_encoding.stages.2.2.bn.running_var",
"context_encoding.stages.3.1.weight", "context_encoding.stages.3.2.bn.weight", "context_encoding.stages.3.2.bn.bias", "context_encoding.stages.3.2.bn.running_mean", "context_encoding.stages.3.2.bn.running_var",
"context_encoding.bottleneck.0.weight", "context_encoding.bottleneck.1.bn.weight", "context_encoding.bottleneck.1.bn.bias", "context_encoding.bottleneck.1.bn.running_mean", "context_encoding.bottleneck.1.bn.running_var",
"edge.conv1.0.weight", "edge.conv1.1.bn.weight", "edge.conv1.1.bn.bias", "edge.conv1.1.bn.running_mean", "edge.conv1.1.bn.running_var",
"edge.conv2.0.weight", "edge.conv2.1.bn.weight", "edge.conv2.1.bn.bias", "edge.conv2.1.bn.running_mean", "edge.conv2.1.bn.running_var",
"edge.conv3.0.weight", "edge.conv3.1.bn.weight", "edge.conv3.1.bn.bias", "edge.conv3.1.bn.running_mean", "edge.conv3.1.bn.running_var",
"edge.conv4.weight", "edge.conv4.bias", "edge.conv5.weight", "edge.conv5.bias",
"decoder.conv1.0.weight", "decoder.conv1.1.bn.weight", "decoder.conv1.1.bn.bias", "decoder.conv1.1.bn.running_mean", "decoder.conv1.1.bn.running_var",
"decoder.conv2.0.weight", "decoder.conv2.1.bn.weight", "decoder.conv2.1.bn.bias", "decoder.conv2.1.bn.running_mean", "decoder.conv2.1.bn.running_var",
"decoder.conv3.0.weight", "decoder.conv3.1.bn.weight", "decoder.conv3.1.bn.bias", "decoder.conv3.1.bn.running_mean", "decoder.conv3.1.bn.running_var",
"decoder.conv3.2.weight", "decoder.conv3.3.bn.weight", "decoder.conv3.3.bn.bias", "decoder.conv3.3.bn.running_mean", "decoder.conv3.3.bn.running_var",
"decoder.conv4.weight", "decoder.conv4.bias",
"fushion.0.weight", "fushion.1.bn.weight", "fushion.1.bn.bias", "fushion.1.bn.running_mean", "fushion.1.bn.running_var", "fushion.3.weight", "fushion.3.bias".
Unexpected key(s) in state_dict: "fc.weight", "fc.bias".

Did I understand you correctly: first load the state_dict into the model, and only then wrap it?

Yes, this was my suggestion. The error message has now changed, but your model is still unable to map this state_dict to its modules.
You would have to check how the internal modules in your model are defined and in particular check what the state_dict contains.
In particular, your model seems to contain model.context_encoding, model.edge, etc. which are all missing in the state_dict.

Alright, no problem. Thanks a lot for your help! I will continue debugging the code and see what the issue is. Sincere thanks! Have a nice day :heart:

Sure, let me know if you get stuck.
PS: I would start by checking the module names of the model and comparing them against the keys in the state_dict. Something like this could give you more information about what’s causing the mismatch:

import torchvision.models as models

model = models.resnet18()
state_dict = models.vgg16().state_dict()

# print the model's module names
for name, _ in model.named_modules():
    print(name)

# print the checkpoint's keys
for key in state_dict:
    print(key)

This would of course also create a mismatch when you are trying to load the VGG16 state_dict into a ResNet and would show that the actual module names are completely different.
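The same comparison can also be done mechanically with set arithmetic on the key names, which directly reproduces the two halves of the error message. A plain-Python sketch (the key names below are made up for illustration and mimic the "module." prefix mismatch from this thread):

```python
# Hypothetical key sets: what the model expects vs. what the checkpoint holds.
model_keys = {"conv1.weight", "bn1.weight", "bn1.bias"}
ckpt_keys = {"module.conv1.weight", "module.bn1.weight", "module.bn1.bias"}

# Keys the model expects but the checkpoint lacks -> "Missing key(s)"
missing = sorted(model_keys - ckpt_keys)
# Keys the checkpoint has but the model doesn't -> "Unexpected key(s)"
unexpected = sorted(ckpt_keys - model_keys)

print("missing:", missing)
print("unexpected:", unexpected)
```

In practice you would build the two sets from `model.state_dict().keys()` and the loaded checkpoint’s keys instead of the hard-coded examples.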

Hi. With the below code:
model = network(num_classes=num_classes, pretrained=None)
state_dict = torch.load(args.restore_weight)['state_dict']
model.load_state_dict(state_dict)
model = nn.DataParallel(model)
model.eval()
I face the problem of an extra 'module.' prefix in the state_dict keys, while the 'bn.' part of the keys is still present. So, I thought of removing 'module.' with the code below:
model = network(num_classes=num_classes, pretrained=None)
state_dict = torch.load(args.restore_weight)['state_dict']
from collections import OrderedDict
new_state_dict = OrderedDict()
for k, v in state_dict.items():
    name = k[7:]  # remove "module."
    new_state_dict[name] = v
model.load_state_dict(new_state_dict)
model = nn.DataParallel(model)
model.eval()
With this, 'module.' is finally removed. However, I face another problem: the 'bn.' part is removed as well. Do you have any idea about this? Thanks in advance.

Remove the 'module.' prefix from the key only if it is actually found, by adding an if condition before the line name = k[7:], so that the other keys won’t be changed.
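A minimal sketch of that conditional stripping (the example keys at the bottom are hypothetical, chosen to mimic the key names in this thread):

```python
from collections import OrderedDict

def strip_module_prefix(state_dict):
    """Remove a leading "module." only where it exists, so keys such as
    "bn1.running_mean" are left untouched."""
    new_state_dict = OrderedDict()
    for k, v in state_dict.items():
        name = k[len("module."):] if k.startswith("module.") else k
        new_state_dict[name] = v
    return new_state_dict

# Hypothetical example keys to show the effect:
sd = OrderedDict([("module.conv1.weight", 0), ("bn1.running_mean", 1)])
print(list(strip_module_prefix(sd).keys()))
# ['conv1.weight', 'bn1.running_mean']
```

Unconditional slicing (`k[7:]`) chops the first seven characters off every key, which is why keys without the prefix lose their leading part; the `startswith` check above only slices where the prefix is really there.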