How to load a model properly?

I’ve trained my model and saved it as a model.pt file, and now I would like to use it to predict new records.

So I’m trying to do the equivalent of my TensorFlow code:

model = keras.models.load_model('path/to/location')
prediction = model.predict(my_new_record)

but when I do this with PyTorch:

model = torch.load('model.pt')
prediction = model(my_new_record)

I get

TypeError: ‘collections.OrderedDict’ object is not callable

How exactly are you saving the model? When you save the model it is usually saved as an OrderedDict, i.e. a dictionary that contains multiple objects which can be retrieved via a given key. For example, I save my model like this:

torch.save({'epoch': epoch,
            'loss': loss,
            'model_state_dict': model.state_dict(),
            'optim_state_dict': optim.state_dict()}, "path/to/location.pt")

then to load I do,

state_dict = torch.load(f="path/to/location.pt", map_location=device)
epoch = state_dict['epoch']
loss = state_dict['loss']
model.load_state_dict(state_dict['model_state_dict'])
optim.load_state_dict(state_dict['optim_state_dict'])
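To make the pattern above concrete, here is a minimal runnable sketch of the same checkpoint round trip; the tiny linear model and the checkpoint filename are placeholders, not anything from the thread:

```python
import torch
import torch.nn as nn

# Placeholder model and optimizer -- substitute your own architecture
model = nn.Linear(4, 2)
optim = torch.optim.SGD(model.parameters(), lr=0.1)

# Save a checkpoint dictionary keyed the same way as above
torch.save({'epoch': 3,
            'loss': 0.25,
            'model_state_dict': model.state_dict(),
            'optim_state_dict': optim.state_dict()}, 'checkpoint.pt')

# Load it back -- the model and optimizer must already be constructed
device = torch.device('cpu')
state_dict = torch.load('checkpoint.pt', map_location=device)
model.load_state_dict(state_dict['model_state_dict'])
optim.load_state_dict(state_dict['optim_state_dict'])
print(state_dict['epoch'])  # 3
```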

Could you print out model from the line model = torch.load("model.pt") ?

I saved model like that:
torch.save(model.state_dict(), 'model.pt')

when I print it, it’s just an OrderedDict with many tensors:

Instead of doing,

model = torch.load('model.pt')

Try,

model = modelClass() # initialize your model class
model.load_state_dict(torch.load('model.pt'))

For loading and saving, refer to this link.
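Putting the two snippets above together, a self-contained sketch of the state_dict workflow might look like this; `ModelClass` here is only a stand-in for whatever architecture you actually trained:

```python
import torch
import torch.nn as nn

# Stand-in for your real architecture -- it must match the one used in training
class ModelClass(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)

# Training side: save only the parameters
model = ModelClass()
torch.save(model.state_dict(), 'model.pt')

# Inference side: rebuild the architecture, then load the parameters
model = ModelClass()                      # initialize your model class
model.load_state_dict(torch.load('model.pt'))
model.eval()                              # disable dropout/batchnorm updates
prediction = model(torch.randn(1, 4))     # the model object is now callable
```

Loading `torch.load('model.pt')` directly returns the OrderedDict of tensors, which is why calling it raises the `TypeError` above.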


Your OrderedDict seems to contain all the parameters of your model. Try what @Sayed_Nadim stated above: pass the saved object to model.load_state_dict. That should work; if not, just post the error below!


I’m not sure what to put into modelClass, because I loaded a pretrained vgg16 model and then fine-tuned it on another set of images, so I never defined a model class of my own.

How exactly is ‘model.pt’ saved? I assume you have some model and you just do torch.save(model.state_dict(), 'model.pt')? Is that correct?

yes, as I wrote in my reply before, it’s torch.save(model.state_dict(), 'model.pt')

Could you try following what I said above?

So, save the model as,

torch.save({'model_state_dict': model.state_dict()}, "model.pt")

and try loading it back with,

state_dict = torch.load(f="model.pt", map_location=device) #device is torch.device('cpu') or torch.device("cuda")
model.load_state_dict(state_dict['model_state_dict'])

This is slightly more convoluted, but it might work! If this fails, could you print out the keys of the dictionary?

for key, value in state_dict:
  print(key)

what should the model variable be here in my case? model = models.vgg16(pretrained=use_pretrained)? If so, it breaks with:

RuntimeError: Error(s) in loading state_dict for VGG:
	size mismatch for classifier.6.weight: copying a param with shape torch.Size([27, 4096]) from checkpoint, the shape in current model is torch.Size([1000, 4096]).
	size mismatch for classifier.6.bias: copying a param with shape torch.Size([27]) from checkpoint, the shape in current model is torch.Size([1000]).

when I do for key, value in state_dict: print(key) I get:
it’s:

ValueError: too many values to unpack (expected 2)

so it just prints the key; with for key in state_dict: print(key) it’s:

model_state_dict

Ah, my apologies, I should’ve phrased the last statement more clearly. I meant to try the for key, value in state_dict expression on your original torch.save object (and note it should be state_dict.items() to unpack key-value pairs, which is what caused the ValueError). It makes sense that it prints model_state_dict, as that’s the key we used to save the model’s state_dict!
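As an aside, iterating a dict directly yields only its keys, which is why the two-variable form fails; a quick sketch with a plain OrderedDict (the contents here are made up for illustration):

```python
from collections import OrderedDict

state_dict = OrderedDict([('model_state_dict', {'w': [1.0, 2.0]})])

# Iterating the dict directly yields only keys
for key in state_dict:
    print(key)  # model_state_dict

# for key, value in state_dict: ...  would try to unpack each key string
# into two variables and raise "ValueError: too many values to unpack"

# To get key-value pairs, use .items()
for key, value in state_dict.items():
    print(key, type(value))
```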

When loading the model you’re using the right object. The issue is that your state_dict has the wrong shape for the layer called classifier.6: the freshly created VGG16 expects weights of shape [1000, 4096], whereas your checkpoint’s are [27, 4096].

I assume you’re trying to use VGG16 for classification, but you only have 27 classes rather than the default 1000? If so, you just need to change the model’s output layer to 27 classes before calling load_state_dict, and the loading should be fine!
