Save and load a multi-part model

I’m trying to train an autoencoder and then use it as an image classifier. Here is my code for the autoencoder:

import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self):
        super(Autoencoder, self).__init__()
        # encoder: 784 -> 16
        self.encoder = nn.Sequential(nn.Linear(in_features=784, out_features=16),
                                     nn.ReLU())
        # decoder: 16 -> 784
        self.decoder = nn.Sequential(nn.Linear(in_features=16, out_features=784),
                                     nn.ReLU())

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return x

net = Autoencoder()

After training the autoencoder, I’m training the classifier:

# copy the encoder part
new_classifier = nn.Sequential(*list(net.children())[:-1])
net = new_classifier
# add an FC layer and LogSoftmax (in_features matches the encoder output; x is the number of classes)
net.add_module('classifier', nn.Sequential(nn.Linear(16, x), nn.LogSoftmax(dim=1)))

Now I would like to save and load the model in the format below.

torch.save(the_model.state_dict(), PATH)

Then later:

the_model = TheModelClass(*args, **kwargs)
the_model.load_state_dict(torch.load(PATH))

How can I load the model as above if I have two parts (encoder and classifier)?
The documentation on saving and loading multiple models is a little bit confusing to me. It would be helpful if someone could clarify this.
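For reference, the multi-model pattern from the documentation that I have in mind looks roughly like this (EncoderClass / ClassifierClass are just placeholders for my two parts, analogous to TheModelClass above):

torch.save({
    'encoder_state_dict': encoder.state_dict(),
    'classifier_state_dict': classifier.state_dict(),
}, PATH)

# later: recreate both parts, then load each state_dict from the checkpoint
encoder = EncoderClass(*args, **kwargs)
classifier = ClassifierClass(*args, **kwargs)
checkpoint = torch.load(PATH)
encoder.load_state_dict(checkpoint['encoder_state_dict'])
classifier.load_state_dict(checkpoint['classifier_state_dict'])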

Thanks

Can anyone please help me with this?

I’m not sure I understand the use case correctly.

You can directly save and load the state_dict of the Autoencoder even if it contains the two submodules.
Wrapping the model into an nn.Sequential container might work for your use case, but would change the state_dict keys.
Saving the state_dict of the nn.Sequential container and loading it into the Autoencoder will most likely not work out of the box and you would need to adapt the keys inside the state_dict.
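As a minimal sketch using the Autoencoder you posted, you can compare the keys directly (the printed keys are what I would expect, assuming the posted definition):

net = Autoencoder()
print(net.state_dict().keys())
# odict_keys(['encoder.0.weight', 'encoder.0.bias', 'decoder.0.weight', 'decoder.0.bias'])

# wrapping only the encoder in a new nn.Sequential renames the submodule to "0"
seq = nn.Sequential(*list(net.children())[:-1])
print(seq.state_dict().keys())
# odict_keys(['0.0.weight', '0.0.bias'])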


I understand the point about adapting the keys. I tried a few things myself.

Here is how I’m saving and loading the model.


torch.save(
        {
            'model_state_dict': net.state_dict(),
            'optimizer_state_dict': optimizer.state_dict(),
        }, './model.pth')


Loading the model…

class Initialise(nn.Module):
    def __init__(self):
        super(Initialise, self).__init__()
        self.encoder = nn.Sequential(nn.Linear(28*28*3, 256), nn.ReLU())
        self.classifier = nn.Sequential(nn.Linear(256, 100),nn.LogSoftmax(dim=1))
  
    def forward(self, x):
        x = self.encoder(x)
        x = self.classifier(x)
        return x

model = Initialise()
checkpoint = torch.load('./model.pth', map_location='cuda')
model.load_state_dict(checkpoint['model_state_dict'], strict=False)

But at the moment I’m facing the below error.

_IncompatibleKeys(missing_keys=['encoder.0.weight', 'encoder.0.bias'], unexpected_keys=['0.0.weight', '0.0.bias'])

If I understand properly, I should just have to change the layer name from 0 to encoder before saving, right?

Yes, the issue is raised since you are changing the model definition from a custom module using the self.encoder and self.classifier submodules to a sequential module.
One fix would be to rename the keys in the state_dict. However, I’m currently unsure if you really need to change the model definition or if it would be possible to keep the custom model definition.
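For example, a minimal sketch of renaming the keys (assuming only the keys with the 0. prefix from the nn.Sequential wrapper need to be mapped to encoder.) could look like this:

checkpoint = torch.load('./model.pth')
state_dict = checkpoint['model_state_dict']

# remap "0.*" (the rewrapped encoder) to "encoder.*"; all other keys stay unchanged
new_state_dict = {
    ('encoder.' + key[len('0.'):] if key.startswith('0.') else key): value
    for key, value in state_dict.items()
}

model = Initialise()
model.load_state_dict(new_state_dict)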
Could you explain your use case a bit more and why you are rewrapping the submodules in an nn.Sequential container?


My use case here is to train an autoencoder and then use the trained encoder as a classifier by adding an FC layer at the end.

This was my question: I’m currently unsure how to load the model using the custom model definition.

I also tried the code below, but it results in the same _IncompatibleKeys error as above. It is the same code used during custom model training (the decoder part is not included, for obvious reasons).

class Initialise(nn.Module):
    def __init__(self):
        super(Initialise, self).__init__()
        self.encoder = nn.Sequential(nn.Linear(28*28*3, 256), nn.ReLU())

    def forward(self, x):
        x = self.encoder(x)
        return x


model = Initialise()
model.add_module('classifier', nn.Sequential(nn.Linear(256, 100), nn.LogSoftmax(dim=1)))


I’m using nn.Sequential in the encoder and classifier so that the modules stay in order, and I saw that removing the rewrapping wouldn’t solve the problem either :frowning:

I did this and it solved the problem, but it’s not an optimal solution, as the code is not generic.

I think the problem comes when I copy the weights of the encoder. The key name is changed when I copy it to train the classifier network:


# copy the weights and biases of the encoder
new_classifier = nn.Sequential(*list(net.children())[:-1])

net = new_classifier
# add the classifier layer
net.add_module('classifier', nn.Sequential(nn.Linear(256, 100), nn.LogSoftmax(dim=1)))

I’m not sure how to copy the weights along with the key names.

This model definition:

class Initialise(nn.Module):
    def __init__(self):
        super(Initialise, self).__init__()
        self.encoder = nn.Sequential(nn.Linear(28*28*3, 256), nn.ReLU())

    def forward(self, x):
        x = self.encoder(x)
        return x


model = Initialise()
model.add_module('classifier', nn.Sequential(nn.Linear(256, 100), nn.LogSoftmax(dim=1)))

won’t use the classifier submodule, as Initialise does not call it in its forward method, as can be seen here:

x = torch.randn(1, 2352)
out = model(x)
print(out.shape)
> torch.Size([1, 256])

You could use a placeholder module and replace it via:

class Initialise(nn.Module):
    def __init__(self):
        super(Initialise, self).__init__()
        self.encoder = nn.Sequential(nn.Linear(28*28*3, 256), nn.ReLU())
        self.classifier = nn.Identity()

    def forward(self, x):
        x = self.encoder(x)
        x = self.classifier(x)
        return x


model = Initialise()
model.classifier = nn.Sequential(nn.Linear(256, 100), nn.LogSoftmax(dim=1))

x = torch.randn(1, 2352)
out = model(x)
print(out.shape)
> torch.Size([1, 100])

You could also save and load the state_dict via:

torch.save(model.state_dict(), 'tmp.pt')
model = Initialise()
model.classifier = nn.Sequential(nn.Linear(256, 100),nn.LogSoftmax(dim=1))
model.load_state_dict(torch.load('tmp.pt'))

Note however, that the replacement of the nn.Identity() module would still be necessary.
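Alternatively, if you want to keep the nn.Sequential rewrapping, a sketch that preserves the original submodule names (and thus the state_dict keys) would be to build the container from an OrderedDict of named children, assuming net is the trained autoencoder:

from collections import OrderedDict

# keep the "encoder" name when copying everything except the decoder
modules = OrderedDict(list(net.named_children())[:-1])
new_classifier = nn.Sequential(modules)
new_classifier.add_module('classifier', nn.Sequential(nn.Linear(256, 100), nn.LogSoftmax(dim=1)))

print(new_classifier.state_dict().keys())
# odict_keys(['encoder.0.weight', 'encoder.0.bias', 'classifier.0.weight', 'classifier.0.bias'])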
Let me know if this would work for you.


Thank you for your reply.

The error persists in the same way. Changing the key names makes the keys match and the model load properly.

As I had mentioned, while copying the encoder for further training as a classifier, I’m not copying the key names, and that is what causes the error. The name before copying was encoder; after copying it becomes 0, hence 0.0.weight and 0.0.bias.

P.S.: Sorry for pasting an image instead of the code :no_mouth: