Model loading error when using torch.load

Hi, I want to use a trained model (for example, this model), but when I try to load it via torch.load():

import torch
from torchvision import transforms
import torchvision.models as models
import cv2

device = torch.device("mps")
model = torch.load("epoch_50.pth", map_location=device)
model.eval()

# Read the image and convert it to a batched CHW tensor
transform = transforms.ToTensor()
image = cv2.imread("exmp.jpg")
input = transform(image)
input = input.unsqueeze(0)

input = input.to(device)  # .to() is not in-place, so reassign
result = model(input)
print(result)

It gives me this error:

I finally figured out that this is because the model was saved as a state_dict.
After that I tried to follow this tutorial, but I ran into another problem: there is no nn.Module class to load the state_dict into. In short, I have no class like class TheModelClass(nn.Module); I just want to use a trained model.

Maybe using the backbone stated on the model's page to load the state_dict would work, like so:

import torch
from torchvision import transforms
import torchvision.models as models
import cv2

device = torch.device("mps")
model = models.resnet50()  # backbone listed on the model's page
model.load_state_dict(torch.load("epoch_50.pth", map_location=device))
model.eval()

transform = transforms.ToTensor()
image = cv2.imread("exmp.jpg")
input = transform(image)
input = input.unsqueeze(0)

input = input.to(device)  # .to() is not in-place, so reassign
result = model(input)
print(result)

but unfortunately this approach also ends with another error:

This is a problem I cannot find a useful solution for. Any reply will be appreciated, thank you so much.

model.load_state_dict(torch.load("epoch_50.pth", map_location = device)) is incorrect.

Try model.load_state_dict(torch.load("epoch_50.pth", map_location = device)['state_dict']) instead.

According to the error, torch.load("epoch_50.pth", map_location = device) is a dictionary that contains meta, state_dict, and optimizer. The model parameters you need are under state_dict.
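A quick sketch to confirm this before loading, assuming the checkpoint really is a plain dictionary with those keys:

import torch

device = torch.device("mps")
checkpoint = torch.load("epoch_50.pth", map_location=device)

# Inspect the top-level structure of the checkpoint
print(type(checkpoint))   # expected: <class 'dict'>
print(checkpoint.keys())  # expected: meta, state_dict, optimizer

# The parameter tensors live under 'state_dict'
print(list(checkpoint["state_dict"].keys())[:5])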


Thank you for your response, this is a good point, but unfortunately I think the major problem is what that model must be, because I just want to use a pre-trained model, not an nn.Module written from scratch.
And as shown below, the backbone approach is also not working.

import torch
from torchvision import transforms
import torchvision.models as models
import cv2

device = torch.device("mps")
model = models.resnet50()
checkpoint = torch.load("epoch_50.pth", map_location=device)

model.load_state_dict(checkpoint["state_dict"])

transform = transforms.ToTensor()
image = cv2.imread("exmp.jpg")
input = transform(image)
input = input.unsqueeze(0)

input = input.to(device)  # .to() is not in-place, so reassign
result = model(input)
print(result)

output:

RuntimeError: Error(s) in loading state_dict for ResNet:
        Missing key(s) in state_dict: "conv1.weight", "bn1.weight", "bn1.bias", "bn1.running_mean", 
...
        Unexpected key(s) in state_dict: "backbone.conv1.weight", "backbone.bn1.weight", 
...
"visibility_classifier.linear.bias", "landmark_regression.linear.weight", "landmark_regression.linear.bias".

That is because the names of the parameters are different.
You should manually rename the keys, for example
backbone.conv1.weight to conv1.weight.
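A rough sketch of one way to do that, assuming every backbone weight is prefixed with backbone. and that the backbone really matches torchvision's ResNet-50 (which may not hold):

import torch
import torchvision.models as models

device = torch.device("mps")
checkpoint = torch.load("epoch_50.pth", map_location=device)

# Keep only the backbone entries and strip the "backbone." prefix
backbone_state = {
    key[len("backbone."):]: value
    for key, value in checkpoint["state_dict"].items()
    if key.startswith("backbone.")
}

model = models.resnet50()
# strict=False because the classification head in the checkpoint
# will not line up with torchvision's fc layer
missing, unexpected = model.load_state_dict(backbone_state, strict=False)
print("missing:", missing)
print("unexpected:", unexpected)

Note that this only recovers the backbone; the visibility_classifier and landmark_regression heads in the checkpoint have no counterpart in torchvision's ResNet, so the outputs will not match the original model.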

Good luck:)


Thanks thecho7. But this approach smells of brute force. There must be a proper way to handle this awkward problem.

It looks like you are loading a different model's checkpoint. The checkpoint you have (epoch_50.pth) was not trained with torchvision's implementation of resnet50.

models.resnet50(weights="IMAGENET1K_V2")

This will load a ResNet-50 pre-trained on ImageNet-1K, which is not the model your checkpoint came from.
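If a stock ImageNet classifier is enough for you, here is a minimal sketch of using those weights end to end (the image path is just a placeholder):

import torch
import torchvision.models as models
from torchvision.models import ResNet50_Weights
from torchvision.io import read_image

device = torch.device("mps")

weights = ResNet50_Weights.IMAGENET1K_V2
model = models.resnet50(weights=weights).to(device)
model.eval()

# The weights enum carries the matching preprocessing pipeline
preprocess = weights.transforms()

image = read_image("exmp.jpg")                     # uint8 CHW tensor
batch = preprocess(image).unsqueeze(0).to(device)

with torch.no_grad():
    logits = model(batch)
print(logits.argmax(dim=1))

To use epoch_50.pth itself, you would instead need the model class from the codebase that produced it (the key names suggest a backbone plus visibility/landmark heads) and load the checkpoint's state_dict into that class.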