Saving parameters of a model that feeds into another

Hello,

I have a model training setup as follows:

input data > feature extraction CNN > main prediction model.

My question is: after training, when I save the state_dict, does this also save the parameters of the feature extractor network in addition to the parameters of MyModel? The main model looks something like this:

class MyModel(nn.Module):
    def __init__(self, feature_extractor):
        super(MyModel, self).__init__()
        self.fc1 = nn.Linear(16, 10)
        self.fc2 = nn.Linear(10, 10)
        self.fc3 = nn.Linear(10, 4)
        self.feature_extractor = feature_extractor  #this is a CNN network that transforms the original data
        
    def forward(self, x):
        x = F.relu(self.feature_extractor(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.softmax(self.fc3(x), dim=1)
        return x
model = MyModel(feature_extractor)
### train code here ###
th.save(model.state_dict(), "model.pth")

If the answer is no, how can I ensure that the parameters of both models are saved after training?
Also, what is a neat way to save/load the parameters of both models?

Thank you.

You can train the extractor first, save its state_dict, and then load those weights into the extractor inside MyModel, for example:

import torch
import torch.nn as nn
import torch.nn.functional as F


class Extractor(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 4, 5)
        self.bn = nn.BatchNorm2d(4)
    
    def forward(self, x):
        return self.bn(self.conv(x))


class MyModel(nn.Module):
    def __init__(self, feature_extractor):
        super().__init__()
        self.fc1 = nn.Linear(16, 10)
        self.fc2 = nn.Linear(10, 10)
        self.fc3 = nn.Linear(10, 4)
        self.feature_extractor = feature_extractor 
        
    def forward(self, x):
        x = F.relu(self.feature_extractor(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.softmax(self.fc3(x), dim=1)
        return x


def main():
    extractor = Extractor()
    # ... train extractor 
    print(extractor.conv.weight)
    torch.save(extractor.state_dict(), "params.pth")
    
    model = MyModel(extractor)  # or MyModel(Extractor())
    state_dict = torch.load("params.pth")
    model.feature_extractor.load_state_dict(state_dict)
    print(model.feature_extractor.conv.weight)

main()
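
To answer the original question directly: because feature_extractor is assigned as an attribute in __init__, PyTorch registers it as a submodule, so its parameters are included in the parent model's state_dict() under a "feature_extractor." prefix. A minimal sketch reusing the classes above (the file name is arbitrary):

extractor = Extractor()
model = MyModel(extractor)

# Nested parameters appear with a "feature_extractor." prefix:
print(list(model.state_dict().keys()))
# ['fc1.weight', 'fc1.bias', ..., 'feature_extractor.conv.weight',
#  'feature_extractor.bn.running_mean', ...]

# So a single save/load round-trips both models at once:
torch.save(model.state_dict(), "model.pth")
model.load_state_dict(torch.load("model.pth"))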

Thanks for the response!

The extractor is not trained first. It is part of the main network, i.e. the convolutions are applied to the input x, and the transformed x is then passed to the linear layers in MyModel.

I should also mention that only the main MyModel class contains an optimizer, i.e.

class MyModel(nn.Module):
    def __init__(self, feature_extractor):
        super(MyModel, self).__init__()
        self.fc1 = nn.Linear(16, 10)
        self.fc2 = nn.Linear(10, 10)
        self.fc3 = nn.Linear(10, 4)
        self.feature_extractor = feature_extractor  #this is a CNN network that transforms the original data
        self.optimizer = optim.Adam(self.parameters(), lr=1e-3)

The feature extractor class does not contain an optimizer. I believe backprop will still update the feature extractor's parameters regardless; is my understanding correct?

Yes, I think so :smiley: Because feature_extractor is assigned as a module attribute, self.parameters() includes its parameters, so the Adam optimizer updates it along with the linear layers.
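
Here is a quick sketch to convince yourself, assuming the MyModel variant that creates self.optimizer in __init__ as in your snippet (the input shape is a dummy, and I call the extractor directly because the toy fc sizes don't match a 4x4 feature map): self.parameters() walks every registered submodule, so Adam holds the extractor's weights too.

model = MyModel(Extractor())

# The optimizer's param groups contain the extractor's parameters:
extractor_ids = {id(p) for p in model.feature_extractor.parameters()}
optim_ids = {id(p) for g in model.optimizer.param_groups for p in g["params"]}
print(extractor_ids <= optim_ids)  # True

# Backprop reaches the extractor, and step() changes its weights:
before = model.feature_extractor.conv.weight.detach().clone()
out = model.feature_extractor(torch.randn(2, 3, 8, 8))  # dummy input
out.sum().backward()
model.optimizer.step()
print(torch.equal(before, model.feature_extractor.conv.weight))  # False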