Hi, I followed the PyTorch ImageNet example code to train a binary model, which has been saved as "model_best.pth.tar". Then I perform the following operations to load the model.
vgg = models.vgg19(pretrained=False)
vgg.load_state_dict(torch.load('model_best.pth.tar'))
cnn = vgg.features
cnn = cnn.cuda()
model = nn.Sequential()
model = model.cuda()
But it gives me the error "KeyError: 'unexpected key "arch" in state_dict'". I am not sure why load_state_dict cannot load the saved model. It appears that the original ImageNet code also saves args.arch in the checkpoint. Thanks for the help.
Probably you are loading a checkpoint dict that contains additional key-value pairs, e.g. the state_dict, the best accuracy, etc.
Could you print the result of torch.load('model...')?
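For reference, a rough sketch of what that usually looks like (assuming the checkpoint was written by the reference ImageNet script, which stores the weights next to other metadata such as the epoch and best accuracy):

checkpoint = torch.load('model_best.pth.tar')
print(checkpoint.keys())
# e.g. dict_keys(['epoch', 'arch', 'state_dict', 'best_prec1', 'optimizer'])

# load only the weights, not the surrounding metadata
vgg = models.vgg19(pretrained=False)
vgg.load_state_dict(checkpoint['state_dict'])

(If the features were wrapped in DataParallel during training, the keys will also carry a module. prefix; see below.)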
So the following should be the correct way to load the model.
cnn = models.vgg19()
cnn.features = torch.nn.DataParallel(cnn.features)
cnn.cuda()
checkpoint = torch.load('models/checkpoint.pth.tar')
cnn.load_state_dict(checkpoint['state_dict'])
In this case, the loading works fine. But if I want to iterate over the CNN,
for i, layer in enumerate(list(cnn)):
it gives me TypeError: 'VGG' object is not iterable. So I guess I am not loading the model the way I want?
Do you want to see all layers?
If so, you could iterate over .children() or .modules().
I’m answering on my mobile now, so I cannot check it, but I think you cannot iterate a model like this.
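Something along these lines should work (untested, typed from memory):

# direct submodules, e.g. features and classifier
for i, layer in enumerate(cnn.children()):
    print(i, layer)

# or walk every module recursively, with its qualified name
for name, module in cnn.named_modules():
    print(name, module)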
It appears that the model keys are also inconsistent.
When you use the following code:
cnn = models.vgg19()
cnn_state_dict = cnn.state_dict()
print(cnn_state_dict)
it gives you keys like "features.0.weight".
And if you load a previously trained model,
checkpoint = torch.load('models/checkpoint.pth.tar')
print(checkpoint['state_dict'].keys())
Then it will give you "KeyError: 'unexpected key "features.module.0.weight" in state_dict'" when loading. What are the solutions for this one?
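The extra module. level comes from the DataParallel wrapper around cnn.features, so the checkpoint stores features.module.0.weight while a plain vgg19 expects features.0.weight. Besides wrapping the features in DataParallel before loading (as in the snippet above), one workaround is to rename the keys, roughly like this:

from collections import OrderedDict

checkpoint = torch.load('models/checkpoint.pth.tar')
state_dict = checkpoint['state_dict']

# drop the '.module' level inserted by DataParallel,
# e.g. 'features.module.0.weight' -> 'features.0.weight'
new_state_dict = OrderedDict(
    (k.replace('.module.', '.', 1), v) for k, v in state_dict.items()
)

cnn = models.vgg19()
cnn.load_state_dict(new_state_dict)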
I am facing the same error for vgg13 and vgg19. These two websites talk about this problem, but I have not tested them yet. I am not sure whether this also applies to vgg13. BTW, vgg11 and vgg16 do not have this problem.
Today I retried the code, and there is no such bug. I do not know why. My code is something like this:
model_list = ['vgg11_bn', 'vgg13_bn', 'vgg16_bn', 'vgg19_bn']
for mdl_name in model_list:
    print(mdl_name)
    mdl_path = os.path.join(mdl_result_path, mdl_name)
    print(mdl_path)
    try:
        mdl_file = GetLatestFile(mdl_path)
        print('{} is loaded.'.format(mdl_file))
    except:
        print('The result for {} does not exist.'.format(mdl_name))
        continue
    mdl = pretrainedmodels.__dict__[mdl_name](num_classes=1000, pretrained='imagenet')
    ...
    mdl.load_state_dict(torch.load(mdl_file))
When the bug appears, it seems that the try/except part is not executed correctly, and I guess mdl is still the model from the previous iteration (e.g. vgg11_bn). The output is as follows:
vgg13_bn
~/results/vgg13_bn
Traceback (most recent call last):
File "InterValid.py", line 119, in <module>
if __name__ == '__main__':
File "InterValid.py", line 87, in main
File "~/.conda/envs/myenv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 721, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for VGG:
I changed the model list to ['vgg13_bn'], and now it works fine.
I have no idea why the try/except is not executed correctly…
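Just a guess, since GetLatestFile is your own helper: a bare except catches every exception, so any unexpected failure inside the try block is silently reported as "does not exist" and the loop moves on. Catching a specific exception (here I am assuming the helper raises FileNotFoundError when no result exists; adjust to whatever it actually raises) and printing the error should make it easier to see what really happens:

for mdl_name in model_list:
    mdl_path = os.path.join(mdl_result_path, mdl_name)
    try:
        mdl_file = GetLatestFile(mdl_path)
        print('{} is loaded.'.format(mdl_file))
    except FileNotFoundError as err:  # assumption: the helper raises this when nothing is found
        print('The result for {} does not exist: {}'.format(mdl_name, err))
        continue
    mdl = pretrainedmodels.__dict__[mdl_name](num_classes=1000, pretrained='imagenet')
    mdl.load_state_dict(torch.load(mdl_file))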
Hi @Fish, how did you solve this problem? I've encountered one somewhat similar to yours. Could you please take a look at it? Any suggestion would be great. Thanks!