Unexpected key in state_dict: "bn1.num_batches_tracked"

Hello to all!
I searched the forum, and the only solution I found is to update PyTorch. How can I solve this if I can’t update the PyTorch version on the server? I saved the model, then loaded it on the server and got this:

RuntimeError: Error(s) in loading state_dict for ResNet:
	Unexpected key(s) in state_dict: "bn1.num_batches_tracked", "layer1.0.bn1.num_batches_tracked", "layer1.0.bn2.num_batches_tracked", "layer1.1.bn1.num_batches_tracked", "layer1.1.bn2.num_batches_tracked", "layer2.0.bn1.num_batches_tracked", "layer2.0.bn2.num_batches_tracked", "layer2.0.downsample.1.num_batches_tracked", "layer2.1.bn1.num_batches_tracked", "layer2.1.bn2.num_batches_tracked", "layer3.0.bn1.num_batches_tracked", "layer3.0.bn2.num_batches_tracked", "layer3.0.downsample.1.num_batches_tracked", "layer3.1.bn1.num_batches_tracked", "layer3.1.bn2.num_batches_tracked", "layer4.0.bn1.num_batches_tracked", "layer4.0.bn2.num_batches_tracked", "layer4.0.downsample.1.num_batches_tracked", "layer4.1.bn1.num_batches_tracked", "layer4.1.bn2.num_batches_tracked". 

If you can’t update your PyTorch version on the server, you could try to remove these keys before loading them in your old PyTorch version:

import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 6, 3, 1, 1),
    nn.BatchNorm2d(6),
    nn.ReLU()
)

# Save in new PyTorch version
torch.save(model.state_dict(), 'bn.pth')

# Load in old version
model = ...  # recreate the same architecture here
state_dict = torch.load('bn.pth')
# Remove the buffer that the old PyTorch version does not know about
del state_dict['1.num_batches_tracked']
model.load_state_dict(state_dict)

Just adapt the code to your layer names.
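For a model with many BatchNorm layers, such as the ResNet in the question, deleting each key by hand gets tedious. A minimal sketch of filtering them all by suffix is below. The helper name drop_num_batches_tracked is made up for illustration, and plain numbers stand in for tensors so the snippet runs without a checkpoint file.

```python
from collections import OrderedDict


def drop_num_batches_tracked(state_dict):
    """Return a copy of state_dict without any '*num_batches_tracked' entries."""
    return OrderedDict(
        (key, value)
        for key, value in state_dict.items()
        if not key.endswith("num_batches_tracked")
    )


# Stand-in for torch.load('bn.pth'); the values would normally be tensors.
example = OrderedDict([
    ("bn1.weight", 1.0),
    ("bn1.num_batches_tracked", 0),
    ("layer1.0.bn1.num_batches_tracked", 0),
    ("fc.bias", 2.0),
])

filtered = drop_num_batches_tracked(example)
print(list(filtered))
```

The filtered dictionary can then be passed to model.load_state_dict() in the old PyTorch version.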

I managed to solve the problem with the help of @apaszke’s post in the thread “How to load part of pre trained model?”.

Actually, filtering is not the best solution, because the server still complained. Passing strict=False to model.load_state_dict() is a solution. I hope this helps someone.


Thank you, this solved my problem.

I am using PyTorch 1.0.0, but it says that this version doesn’t have the strict argument:

load_pretrain
modelB=model.load_state_dict(torch.load(url,strict=False))
TypeError: load() got an unexpected keyword argument 'strict'

The strict argument should be passed to load_state_dict, not torch.load.
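A minimal sketch of where the argument goes, assuming a recent PyTorch version in which load_state_dict returns the missing and unexpected keys. The toy model and the key name '1.extra_key' are made up for illustration, and an in-memory buffer stands in for the checkpoint file or URL.

```python
import io

import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 6, 3, 1, 1),
    nn.BatchNorm2d(6),
    nn.ReLU(),
)

# Build a checkpoint that contains one key the model does not expect.
state_dict = model.state_dict()
state_dict["1.extra_key"] = torch.tensor(0)

buffer = io.BytesIO()
torch.save(state_dict, buffer)
buffer.seek(0)

# torch.load only deserializes the file; it has no `strict` argument.
loaded = torch.load(buffer)

# `strict` belongs to load_state_dict, which updates the model in place.
result = model.load_state_dict(loaded, strict=False)
unexpected = list(result.unexpected_keys)
print(unexpected)
```

Note that load_state_dict does not return the model, so assigning its result to a variable such as modelB gives you the key report (or None in older versions), not a network.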

I love this! I was trying to load a model written in Torch 0.4 from Torch 1.1 and was having issues. The newer version had more entries in the state_dict, so I printed the ordered dictionary of each model from both versions and saw that the same Python nn.Module subclass (I am using the pretrained Yolo from this repo: https://github.com/vietnguyen91/Yolo-v2-pytorch) has 133 entries in Torch 1.1 vs. 111 in Torch 0.4. The newer version was adding a bunch of “num_batches_tracked” entries.

In short, using strict=False lets you load the parameters saved in Torch 0.4 into Torch 1.1 using the same Python class for the Yolo network. Hope this helps someone out there.

Hi @ptrblck,

I am having this problem with all densenet201 models:

model = pretrainedmodels.densenet201(pretrained='imagenet')

RuntimeError: Error(s) in loading state_dict for DenseNet:
	Missing key(s) in state_dict: "features.denseblock1.denselayer1.norm1.weight"

No matter what I do, it is the same issue. I am using torch 1.6

model = pretrainedmodels.densenet201(pretrained='imagenet')
state_dict = torch.load('./checkpoints/densenet201-5750cbb1e.pth')

I don’t get this error with the latest torchvision version.
Is torchvision.models imported as pretrainedmodels in your code?
Also, the state_dict name looks different from the torchvision one in this line of code.

Hi @ptrblck,

Good to hear from you. Let me paste the complete code.

model = densenet201()
model.features[0] = nn.Conv2d(1, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)

model.features.conv0
    Conv2d(1, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)

model.classifier = nn.Linear(1920, 264);
model.classifier 
   Linear(in_features=1920, out_features=264, bias=True)


~/miniconda3/envs/torch/lib/python3.8/site-packages/torch/nn/modules/module.py in load_state_dict(self, state_dict, strict)
   1042 
   1043         if len(error_msgs) > 0:
-> 1044             raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
   1045                                self.__class__.__name__, "\n\t".join(error_msgs)))
   1046         return _IncompatibleKeys(missing_keys, unexpected_keys)

RuntimeError: Error(s) in loading state_dict for DenseNet:
	Missing key(s) in state_dict: "features.denseblock3.denselayer37.norm1.weight", "features.denseblock3.denselayer37.norm1.bias", "features.denseblock3.denselayer37.norm1.running_mean", "features.denseblock3.denselayer37.norm1.running_var", "features.denseblock3.denselayer37.conv1.weight", "features.denseblock3.denselayer37.norm2.weight", "features.denseblock3.denselayer37.norm2.bias", "features.denseblock.....

and it goes on like this until it reaches:

size mismatch for features.denseblock4.denselayer24.conv2.weight: copying a param with shape torch.Size([48, 192, 3, 3]) from checkpoint, the shape in current model is torch.Size([32, 128, 3, 3]).
	size mismatch for features.norm5.weight: copying a param with shape torch.Size([2208]) from checkpoint, the shape in current model is torch.Size([1920]).
	size mismatch for features.norm5.bias: copying a param with shape torch.Size([2208]) from checkpoint, the shape in current model is torch.Size([1920]).
	size mismatch for features.norm5.running_mean: copying a param with shape torch.Size([2208]) from checkpoint, the shape in current model is torch.Size([1920]).
	size mismatch for features.norm5.running_var: copying a param with shape torch.Size([2208]) from checkpoint, the shape in current model is torch.Size([1920]).
	size mismatch for classifier.weight: copying a param with shape torch.Size([264, 2208]) from checkpoint, the shape in current model is torch.Size([264, 1920])

I don't understand, because this is what I have before I change the model to take one channel:

    )
    (norm5): BatchNorm2d(1920, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  )
  (classifier): Linear(in_features=1920, out_features=1000, bias=True)
)
Then I just change the last layer, which is the classifier, to reduce the output to my classes with this:

model.classifier = nn.Linear(1920,264); model.classifier 

Why am I getting this error?

size mismatch for classifier.weight: copying a param 
with shape torch.Size([264, 2208]) from checkpoint, the shape 
in current model is torch.Size([264, 1920]).

I don't know what I am doing wrong. Please help.

The funny thing is that when I do it with densenet161, I don't get any errors.

If I don't add the following line of code, I get the error shown after it:
model.classifier = nn.Linear(1920, dls.c)

size mismatch for classifier.bias: copying a param with shape torch.Size([264]) from checkpoint, the shape in current model is torch.Size([1000]).

Which line of code raises this error?
I’m able to load the pretrained model and manipulate the layers as you did:

model = models.densenet201(pretrained=True)
model.features[0] = nn.Conv2d(1, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
model.classifier = nn.Linear(1920, 264)

out = model(torch.randn(1, 1, 224, 224))

If this error is raised during the manual loading of the state_dict, where does this state_dict come from?
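One way to check whether a checkpoint matches the model is to compare shapes key by key before calling load_state_dict. The sketch below assumes both arguments are state_dict-like mappings whose values have a .shape attribute. The helper name find_shape_mismatches is made up, and simple stand-in objects replace real tensors so the snippet runs without the checkpoint. The shapes are the ones from the error message above. The 2208 vs. 1920 pattern suggests the checkpoint was built for a different architecture, which would be consistent with densenet161.

```python
from types import SimpleNamespace


def find_shape_mismatches(checkpoint, model_state):
    """Map each shared key to (checkpoint_shape, model_shape) where they differ."""
    mismatches = {}
    for key, value in checkpoint.items():
        if key not in model_state:
            continue  # missing/unexpected keys are a separate problem
        ckpt_shape = tuple(value.shape)
        model_shape = tuple(model_state[key].shape)
        if ckpt_shape != model_shape:
            mismatches[key] = (ckpt_shape, model_shape)
    return mismatches


# Stand-ins for torch.load(...) and model.state_dict().
checkpoint = {
    "features.norm5.weight": SimpleNamespace(shape=(2208,)),
    "classifier.weight": SimpleNamespace(shape=(264, 2208)),
    "classifier.bias": SimpleNamespace(shape=(264,)),
}
model_state = {
    "features.norm5.weight": SimpleNamespace(shape=(1920,)),
    "classifier.weight": SimpleNamespace(shape=(264, 1920)),
    "classifier.bias": SimpleNamespace(shape=(264,)),
}

mismatches = find_shape_mismatches(checkpoint, model_state)
for key, (ckpt_shape, model_shape) in mismatches.items():
    print(key, ckpt_shape, model_shape)
```

With real objects, the call would be find_shape_mismatches(torch.load(path), model.state_dict()).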

Thanks @ptrblck, you are always here to help. It is working now!

It is a timm pretrained model.

I would be EXTREMELY careful using strict=False and have some logic that verifies that the loading isn’t failing silently.
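A minimal sketch of such a check, assuming a PyTorch version in which load_state_dict returns an object with missing_keys and unexpected_keys (as in the module.py traceback quoted earlier in this thread). The helper name check_load_result and the default list of tolerated suffixes are my own choices, not part of PyTorch.

```python
def check_load_result(missing_keys, unexpected_keys,
                      allowed_unexpected_suffixes=("num_batches_tracked",)):
    """Raise if strict=False hid anything other than the tolerated keys."""
    allowed = tuple(allowed_unexpected_suffixes)
    bad_unexpected = [k for k in unexpected_keys if not k.endswith(allowed)]
    if missing_keys or bad_unexpected:
        raise RuntimeError(
            "state_dict mismatch: missing={}, unexpected={}".format(
                list(missing_keys), bad_unexpected))


# Intended use with PyTorch (not executed here):
#   result = model.load_state_dict(state_dict, strict=False)
#   check_load_result(result.missing_keys, result.unexpected_keys)

# Only tolerated keys were skipped, so this call passes silently.
check_load_result([], ["bn1.num_batches_tracked"])
```

Any missing key, or any unexpected key outside the tolerated suffixes, raises instead of leaving part of the model at its random initialization.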
