How to train hierarchical models in PyTorch

motam79 · December 10, 2021, 7:43pm

I have a PyTorch architecture similar to this example:

import torch
import torch.nn as nn


class MainModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.sub_model_1 = SubModel1()
        self.sub_model_2 = SubModel2()

    def forward(self, features1, features2):
        target_1 = self.sub_model_1(features1)
        target_2 = self.sub_model_2(target_1, features2)
        return dict(
            target_1=target_1,
            target_2=target_2,
        )


class SubModel1(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(2, 1)

    def forward(self, features1):
        return self.layer(features1)


class SubModel2(nn.Module):

    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(3, 1)

    def forward(self, target_1, features2):
        concatenated_features = torch.cat([target_1, features2], dim=1)
        target_2 = self.layer(concatenated_features)
        return target_2

I would like to train the first network (i.e. SubModel1) by providing target_1_hat. Once that network is trained, I would like to freeze its parameters and optimize SubModel2 by providing target_2_hat.

Does PyTorch or a similar library have an option for training such models?

InnovArul · December 11, 2021, 1:35am

I do not see any complications with it. It’s a design choice and you can write code to achieve that.
I.e., you can choose to train submodule1 first and then submodule2, or vice versa; it’s all up to how you write the code and how you use the optimizer and loss functions. Do you face any issues while doing that?
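
The staged training described in the question could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a definitive recipe: the random tensors, the MSE loss, the SGD optimizer, the learning rate, and the step counts are all placeholders chosen for illustration, and features2 is assumed to have 2 columns so that the concatenated input matches nn.Linear(3, 1). Only the class definitions come from the original post.

```python
import torch
import torch.nn as nn


class SubModel1(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(2, 1)

    def forward(self, features1):
        return self.layer(features1)


class SubModel2(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(3, 1)

    def forward(self, target_1, features2):
        return self.layer(torch.cat([target_1, features2], dim=1))


class MainModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.sub_model_1 = SubModel1()
        self.sub_model_2 = SubModel2()

    def forward(self, features1, features2):
        target_1 = self.sub_model_1(features1)
        target_2 = self.sub_model_2(target_1, features2)
        return dict(target_1=target_1, target_2=target_2)


torch.manual_seed(0)

# Placeholder data (assumption): 64 samples of random inputs and targets.
features1 = torch.randn(64, 2)
features2 = torch.randn(64, 2)
target_1_hat = torch.randn(64, 1)
target_2_hat = torch.randn(64, 1)

model = MainModel()
loss_fn = nn.MSELoss()  # assumed loss; use whatever fits your targets

# Stage 1: the optimizer only sees sub_model_1's parameters.
opt1 = torch.optim.SGD(model.sub_model_1.parameters(), lr=0.1)
for _ in range(100):
    opt1.zero_grad()
    loss_1 = loss_fn(model.sub_model_1(features1), target_1_hat)
    loss_1.backward()
    opt1.step()

# Freeze sub_model_1 so autograd no longer computes gradients for it.
for p in model.sub_model_1.parameters():
    p.requires_grad_(False)
frozen_weight = model.sub_model_1.layer.weight.clone()

# Stage 2: the optimizer only sees sub_model_2's parameters.
opt2 = torch.optim.SGD(model.sub_model_2.parameters(), lr=0.1)
for _ in range(100):
    opt2.zero_grad()
    out = model(features1, features2)
    loss_2 = loss_fn(out["target_2"], target_2_hat)
    loss_2.backward()
    opt2.step()
```

Passing only one submodule's parameters to each optimizer is what keeps the stages separate; setting requires_grad to False on the first submodule additionally skips gradient computation for it in stage 2. If the frozen submodule contained layers such as dropout or batch normalization, it would likely also need model.sub_model_1.eval() during stage 2.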