I recently dug out an archived densenet121 checkpoint from 2017 and tried to resurrect it using PyTorch 1.4.0.
It seems the parameter naming convention has changed since then.
When I load the same architecture under the newer PyTorch version with the legacy state_dict, I encounter the following error:
RuntimeError: Error(s) in loading state_dict for DataParallel:
Missing key(s) in state_dict: "module.densenet121.features.denseblock1.denselayer1.norm1.weight", "module.densenet121.features.denseblock1.denselayer1.norm1.bias", ......
Unexpected key(s) in state_dict: "module.densenet121.features.denseblock1.denselayer1.norm.1.weight", "module.densenet121.features.denseblock1.denselayer1.norm.1.bias", ......
Clearly, at some point the keys used a layer.#.param format (e.g. norm.1.weight), which was later changed to layer#.param (e.g. norm1.weight), presumably because dots are no longer allowed in module names.
Is there an existing tool to automatically and generally update the old key names to the newer convention?
I’m not aware of such a tool, but I think it should be easy to remove the unwanted indexing from all keys in the dict. Are you trying to only remove these index numbers or do you have any other mismatches?
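For the DenseNet case specifically, a regex rename should do it. Here is a minimal sketch, assuming the extra dot before the index is the only mismatch (the checkpoint path is hypothetical, and model stands for your DataParallel-wrapped architecture):

import re
import torch

state_dict = torch.load('densenet121_2017.pth', map_location='cpu')  # hypothetical path

# dots are no longer allowed in module names, so e.g. "norm.1" became "norm1";
# this is the same pattern torchvision applies to old DenseNet checkpoints
pattern = re.compile(
    r'^(.*denselayer\d+\.(?:norm|relu|conv))\.((?:[12])\.(?:weight|bias|running_mean|running_var))$')

for key in list(state_dict.keys()):
    res = pattern.match(key)
    if res:
        state_dict[res.group(1) + res.group(2)] = state_dict.pop(key)

# strict=False may be needed if buffers such as num_batches_tracked
# (added in later PyTorch versions) are missing from the old checkpoint
model.load_state_dict(state_dict)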
Hello, I have the same problem. I had already trained a model, but have since changed the names of the layers. Given the description above, how do I then match the old checkpoint to the model’s state_dict?
Thank you in advance for your help!
Hello @ptrblck and @kaltu, I am facing a similar problem. I want to change the names of two keys in the model’s state_dict. I tried the suggested method, but I am getting errors. My code snippet and the error message are below:
import torch
import torch.nn as nn
import torch.nn.functional as F

class LeNet(nn.Module):
    def __init__(self, num_classes=43, input_channels=3):
        super(LeNet, self).__init__()
        self.conv1 = nn.Conv2d(input_channels, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, num_classes)
        if 1 == num_classes:
            # compatible with nn.BCELoss
            self.softmax = nn.Sigmoid()
        else:
            # compatible with nn.CrossEntropyLoss
            self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, x):
        out = F.relu(self.conv1(x))
        out = F.max_pool2d(out, 2)
        out = F.relu(self.conv2(out))
        out = F.max_pool2d(out, 2)
        out = out.view(out.size(0), -1)
        out = F.relu(self.fc1(out))
        out = F.relu(self.fc2(out))
        out = self.fc3(out)
        out = self.softmax(out)
        return out
teacher_model = LeNet()  # get the model
checkpoint = torch.load('model_best.pth.tar', map_location=device)
state_dict = checkpoint['state_dict']
for key in list(state_dict.keys()):
    state_dict[key.replace("conv1.weight", "features.0.weight").replace("conv1.bias", "features.0.bias")] = state_dict.pop(key)
teacher_model.load_state_dict(checkpoint['state_dict'])
It is giving me the following error:
Error(s) in loading state_dict for LeNet:
Missing key(s) in state_dict: "conv1.weight", "conv1.bias".
Unexpected key(s) in state_dict: "features.0.weight", "features.0.bias".
Based on the error message it seems that you are replacing the needed conv1 keys with unexpected features keys. What kind of error were you seeing before trying to manipulate it?
@ptrblck Thank you for the response. Initially, I had no errors and was able to load the model with the old keys. But to use this model in the energy calculation framework, it requires the key names "features.0.weight" and "features.0.bias" instead of "conv1.weight" and "conv1.bias", respectively. Therefore, I am trying to modify the key names and then load the model. Please help me solve the problem.
I don’t quite understand the issue, since your manipulation is creating the mismatches.
Did you maybe forget to modify the model before using load_state_dict?
If your state_dict contains parameters stored as features, which should be loaded in the current conv layers, then yes: change the layer names in the model or alternatively the keys of the state_dict.
Based on your currently posted code: your model uses the conv layer names, the state_dict seems to use the same keys, you then rename them to features, and so you run into the error.
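For completeness, here is a minimal self-contained sketch of the second option, renaming the state_dict keys to match a model that registers its layer under features (the toy modules and the key mapping are assumptions for illustration):

import torch
import torch.nn as nn

# toy "old" module: the parameters live under "conv1"
old = nn.Module()
old.conv1 = nn.Conv2d(3, 6, 5)

# toy "new" module: the same layer is registered under "features.0"
new = nn.Module()
new.features = nn.Sequential(nn.Conv2d(3, 6, 5))

# map the old key names onto the new layout before loading
key_map = {'conv1.weight': 'features.0.weight',
           'conv1.bias': 'features.0.bias'}
renamed = {key_map.get(k, k): v for k, v in old.state_dict().items()}
new.load_state_dict(renamed)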
Please include complete code showing how to load a “checkpoint.pth.tar” file and then use your code to change the keys as you mentioned, so that we can use model.load_state_dict(torch.load(PATH)) for prediction afterwards.
Neither pooling layers nor Flatten or Dropout need any parameters or buffers, so I’m unsure how or why you would like to add buffers to them.
There is “nothing” to store in the state_dict for these layers.
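You can verify this directly; a quick sketch:

import torch.nn as nn

# parameter-free layers have an empty state_dict: nothing to save or load
for layer in [nn.MaxPool2d(2), nn.Flatten(), nn.Dropout(p=0.5)]:
    print(type(layer).__name__, dict(layer.state_dict()))
# MaxPool2d {}
# Flatten {}
# Dropout {}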