Saving a model that's composed of two models

Hi, I am wondering what the correct way is to save and restore a model that is composed of two models.
These are my models:

from collections import OrderedDict

import torch
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights

NUM_CLASSES = 10000


class TunedResNet(nn.Module):
    """
    ResNet-50 with the last fc layer replaced by one with output size NUM_CLASSES (10000)
    """
    def __init__(self):
        super().__init__()
        weights_v2 = ResNet50_Weights.IMAGENET1K_V2
        self.resnet = resnet50(weights=weights_v2)
        fc_in_features = self.resnet.fc.in_features
        fc_out_features = NUM_CLASSES
        self.resnet.fc = nn.Linear(fc_in_features, fc_out_features)

    def forward(self, x):
        return self.resnet(x)


class GeoNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(
            OrderedDict([
                ("fc1", nn.Linear(in_features=4, out_features=500)),
                ("relu1", nn.ReLU()),
                ("drop", nn.Dropout(p=0.2)),
                ("fc2", nn.Linear(in_features=500, out_features=1000)),
                ("relu2", nn.ReLU())]))

    def forward(self, x):
        return self.model(x)


class CombinedNet(nn.Module):
    def __init__(self, resnet, geonet):
        super().__init__()
        self.resnet = resnet
        self.geonet = geonet

        self.resnet_out_feats = self.resnet.resnet.fc.out_features
        self.geonet_out_feats = self.geonet.model.fc2.out_features

        self.fc = nn.Sequential(
            nn.Linear(in_features=self.resnet_out_feats + self.geonet_out_feats, out_features=NUM_CLASSES))

    def forward(self, x):
        image = x["image"]
        location = x["location"]
        r = self.resnet(image)
        g = self.geonet(location)
        concat_tensor = torch.cat((r.view(-1, self.resnet_out_feats), g.view(-1, self.geonet_out_feats)), dim=1)
        return self.fc(concat_tensor)

CombinedNet is composed of two models, TunedResNet and GeoNet.
I read this tutorial, and it saves each model's state_dict separately. However, when I look at the state_dict of the combined model, it seems to contain all the parameters, so I don't think I need to save each model's state_dict separately. Can someone confirm my understanding? Thanks!
This is what I see when I print the parameters in the combined model:

for k in combined_model.state_dict().keys():
    print(k)
<SKIPPED> earlier layers
resnet.resnet.layer4.2.conv3.weight
resnet.resnet.layer4.2.bn3.weight
resnet.resnet.layer4.2.bn3.bias
resnet.resnet.layer4.2.bn3.running_mean
resnet.resnet.layer4.2.bn3.running_var
resnet.resnet.layer4.2.bn3.num_batches_tracked
resnet.resnet.fc.weight
resnet.resnet.fc.bias
geonet.model.fc1.weight
geonet.model.fc1.bias
geonet.model.fc2.weight
geonet.model.fc2.bias
fc.0.weight
fc.0.bias

That’s correct, and the parent nn.Module will contain all registered parameters and buffers from all registered submodules. There is no difference between using a custom nn.Module inside the parent and using any layer from the torch.nn namespace, such as nn.Linear.
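Something like this should work for your model; it's a minimal sketch, and the file name "combined_net.pt" is just a placeholder:

# save one checkpoint covering TunedResNet, GeoNet, and the final fc
torch.save(combined_model.state_dict(), "combined_net.pt")

# restore: recreate the same architecture, then load the single state_dict
model = CombinedNet(TunedResNet(), GeoNet())
model.load_state_dict(torch.load("combined_net.pt"))
model.eval()  # disable dropout in GeoNet for inference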

Thanks @ptrblck! Is the advice in this tutorial for a different scenario? Quoting:

When saving a model comprised of multiple torch.nn.Modules, such as a GAN, a sequence-to-sequence model, or an ensemble of models, you must save a dictionary of each model’s state_dict

When does this apply?

The tutorial is a bit different, as it uses multiple model objects:

netA = Net()
netB = Net()

with different optimizers:

optimizerA = optim.SGD(netA.parameters(), lr=0.001, momentum=0.9)
optimizerB = optim.SGD(netB.parameters(), lr=0.001, momentum=0.9)

which are most likely used in a sequential way via:

out = netA(input)
out = netB(out)

In this case nothing “combines” the models besides the actual forward/backward pass in your training script, which is why the tutorial saves a dictionary holding each model's (and optimizer's) state_dict.
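For reference, the tutorial's checkpoint is roughly a dictionary of the separate state_dicts (PATH being whatever file you save to):

torch.save({
    "modelA_state_dict": netA.state_dict(),
    "modelB_state_dict": netB.state_dict(),
    "optimizerA_state_dict": optimizerA.state_dict(),
    "optimizerB_state_dict": optimizerB.state_dict(),
}, PATH)

# restore by loading the dict and distributing the state_dicts
checkpoint = torch.load(PATH)
netA.load_state_dict(checkpoint["modelA_state_dict"])
netB.load_state_dict(checkpoint["modelB_state_dict"])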
However, if you register both objects in a parent nn.Module, they will be treated like any other module and a single state_dict can be saved.
As already mentioned, there won't be any difference from registering any built-in layer:

import torch
import torch.nn as nn


class MySubmodule(nn.Module):
    def __init__(self):
        super().__init__()
        self.param = nn.Parameter(torch.randn(1, 1))
        self.subsubmodule = nn.Linear(1, 1)
        

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.submodule = MySubmodule()
        self.layer = nn.Linear(1, 1)
    
model = MyModel()
print(dict(model.named_parameters()))
# {'submodule.param': Parameter containing:
# tensor([[0.1399]], requires_grad=True), 'submodule.subsubmodule.weight': Parameter containing:
# tensor([[0.2514]], requires_grad=True), 'submodule.subsubmodule.bias': Parameter containing:
# tensor([0.9140], requires_grad=True), 'layer.weight': Parameter containing:
# tensor([[0.8621]], requires_grad=True), 'layer.bias': Parameter containing:
# tensor([0.0644], requires_grad=True)}
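For completeness, printing the state_dict keys of this toy model should show the same nested naming, so a single torch.save(model.state_dict(), PATH) covers everything:

print(model.state_dict().keys())
# odict_keys(['submodule.param', 'submodule.subsubmodule.weight',
#             'submodule.subsubmodule.bias', 'layer.weight', 'layer.bias'])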

Thanks for the explanation!