Connecting another model's final layers with a pre-trained model

Hello there,

I have a problem figuring out how to do the following: I have a pre-trained ResNet model with custom final layers at the end. Let’s call these parts A (the architecture up to the final layers) and B (the final layers themselves). I have copied that model and cut off part B, so that A now ends with the output of the last convolution layer. I do something with that model, and at a later point I would like to take part B from the first model (the final layers), copy it, and re-attach it to part A of the second model.

How would one go about doing this? I’d appreciate any ideas, thanks!

My suggestion is to print(model) first; then you can see the layers involved.

For example, ResNet’s final layer is model.fc. You can then do this to adapt it to your own final layer:

import torch.nn as nn
from torchvision import models

model = models.resnet18(pretrained=True)
num_features = model.fc.in_features  # number of input features of the ResNet's last layer
num_classes = 20  # however many classes you have
model.fc = nn.Linear(num_features, num_classes)  # connect your own layer in its place

There is more flexibility than this; you just have to determine from which layer you want to make the changes. The example above only swaps the final linear layer so that it connects to the number of outputs you want; see the sketch below for replacing it with a slightly deeper head.
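For instance, something along these lines (just a sketch, assuming the same resnet18 and the 20 classes from the example above) replaces the single fc with a small two-layer head:

import torch.nn as nn
from torchvision import models

model = models.resnet18(pretrained=True)
num_features = model.fc.in_features
num_classes = 20  # as above, just an example

# replace the single fc with a small two-layer head instead
model.fc = nn.Sequential(
    nn.Linear(num_features, 256),
    nn.ReLU(),
    nn.Dropout(0.5),
    nn.Linear(256, num_classes),
)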

Thank you very much for your quick reply.

Wouldn’t nn.Linear create a new linear layer with the specified num_features and num_classes? What if I have a pre-trained model whose fc I would like to use? The architecture is completely the same; I just don’t want to generate a new final layer but rather use the one from an already existing model.

I have never done that before; my guess would be to load the saved model and assign its final layer to model.fc. Make sure that the layer you connect takes in model.fc.in_features features.

Edit: In the use case of adding just a few extra linear layers, it does not really make sense to attach already-trained linear layers at the end, because the backbone they were trained on is different to begin with. I do not see any benefit in doing this.
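A rough sketch of what I mean (assuming both models are torchvision resnet18 instances with a 20-class custom head, and a hypothetical checkpoint file model_b.pth for the second one):

import torch
import torch.nn as nn
from torchvision import models

model_a = models.resnet18(pretrained=True)   # model whose backbone you keep

model_b = models.resnet18(pretrained=False)  # model whose trained fc you want to reuse
model_b.fc = nn.Linear(model_b.fc.in_features, 20)  # same custom head shape as when it was trained
model_b.load_state_dict(torch.load("model_b.pth"))  # hypothetical checkpoint path

# reuse model_b's trained classifier on top of model_a's backbone
assert model_a.fc.in_features == model_b.fc.in_features
model_a.fc = model_b.fc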

Much appreciated, you gave me an idea. I will try it out and report back here if it proves successful.

So this is a solution I found to work (please correct me if I am wrong):

import torch
import torch.nn as nn


class NetBase(nn.Module):

    def __init__(self, model, num_classes):
        super().__init__()

        # everything up to (but not including) the original fc layer
        self.features = nn.Sequential(*list(model.children())[:-1])
        self.fc = nn.Linear(512, num_classes)  # 512 = fc.in_features for resnet18/34

    def forward(self, x):
        output = self.features(x)
        output = torch.flatten(output, 1)  # [N, 512, 1, 1] -> [N, 512]
        output = self.fc(output)

        return output


class NetMerged(nn.Module):

    def __init__(self, model1, model2):
        super().__init__()

        self.features = model1.features  # feature extractor of the first model
        self.fc = model2.fc              # classifier of the second model

    def forward(self, x):
        output = self.features(x)
        output = torch.flatten(output, 1)
        output = self.fc(output)

        return output

The idea is to load both models into an object of a new class and define the forward function so that the input passes through the features of one model and the classifier of the other.

I’ve used the num_classes parameter for testing purposes: if you create two models with different numbers of output classes, combine them, and print the result, you can verify that the classifier of the second model is indeed used together with the feature extractor of the first:

from torchvision import models

m1 = models.resnet18(pretrained=True)  # assuming resnet18 backbones (512-dim features)
m2 = models.resnet18(pretrained=True)

mBase1 = NetBase(m1, 10)
mBase2 = NetBase(m2, 20)

mMerged = NetMerged(mBase1, mBase2)
print(mMerged)
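Note that, written this way, NetMerged shares the classifier’s parameters with mBase2 rather than copying them. Verifying that, and making an independent copy of part B if that is what’s needed, looks like this (a small sketch based on the setup above):

import copy

# mMerged.fc is literally mBase2.fc, so their parameters are shared
print(mMerged.fc is mBase2.fc)   # True
print(mMerged.fc.out_features)   # 20 -> the second model's classifier is in use

# if part B should be an independent copy instead, deep-copy it
mMerged.fc = copy.deepcopy(mBase2.fc)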