Load only a part of the network with pretrained weights

malek · July 8, 2020, 1:31pm

Hi, I am working with an 8-layer CNN. I have already trained it and have the weights (model0.pth). Now I want to add two more layers to the initial network, so I will have 10 layers in total. For training, however, I want to initialize the first 8 layers with the pretrained weights from model0.pth and initialize the remaining layers randomly.
Can you please tell me whether this can be done in PyTorch and, if so, how?
Thank you in advance for your help.

Nikronic · July 8, 2020, 2:35pm

Hi,

There are several ways to do this, but personally I would define a new class that wraps the previously defined model (the one you have weights for) and then add the extra layers as a separate nn.Sequential inside the new model. Something like this:

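import torch.nn as nn
import torchvision

# Load AlexNet with ImageNet weights. Newer torchvision releases
# replace pretrained=True with the weights= argument.
pretrained = torchvision.models.alexnet(pretrained=True)

class MyAlexNet(nn.Module):
    def __init__(self, my_pretrained_model):
        super().__init__()
        # Pretrained part: keeps its trained weights
        self.pretrained = my_pretrained_model
        # New layers: randomly initialized, applied to the 1000-dim AlexNet output
        self.my_new_layers = nn.Sequential(nn.Linear(1000, 100),
                                           nn.ReLU(),
                                           nn.Linear(100, 2))

    def forward(self, x):
        x = self.pretrained(x)
        x = self.my_new_layers(x)
        return x

my_extended_model = MyAlexNet(my_pretrained_model=pretrained)
print(my_extended_model)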

# here is the structure

MyAlexNet(
  (pretrained): AlexNet(
    (features): Sequential(
      (0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
      (1): ReLU(inplace=True)
      (2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
      (3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
      (4): ReLU(inplace=True)
      (5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
      (6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (7): ReLU(inplace=True)
      (8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (9): ReLU(inplace=True)
      (10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (11): ReLU(inplace=True)
      (12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
    )
    (avgpool): AdaptiveAvgPool2d(output_size=(6, 6))
    (classifier): Sequential(
      (0): Dropout(p=0.5, inplace=False)
      (1): Linear(in_features=9216, out_features=4096, bias=True)
      (2): ReLU(inplace=True)
      (3): Dropout(p=0.5, inplace=False)
      (4): Linear(in_features=4096, out_features=4096, bias=True)
      (5): ReLU(inplace=True)
      (6): Linear(in_features=4096, out_features=1000, bias=True)
    )
  )  # everything up to here is the original AlexNet
  (my_new_layers): Sequential(
    (0): Linear(in_features=1000, out_features=100, bias=True)
    (1): ReLU()
    (2): Linear(in_features=100, out_features=2, bias=True)
  )
)
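
The wrapper above keeps the old model as a submodule. If the new 10-layer network is instead written as its own class, as in the original question, another option is to load the old checkpoint with strict=False, so that only parameters with matching names are copied and the rest keep their random initialization. The following is a minimal sketch under assumptions: SmallNet and BiggerNet are hypothetical stand-ins for the 8-layer and 10-layer networks, the shared layers are assumed to keep the same attribute names, and the in-memory state_dict stands in for torch.load("model0.pth").

```python
import torch
import torch.nn as nn

class SmallNet(nn.Module):
    """Hypothetical stand-in for the original, already trained network."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(8, 16, kernel_size=3, padding=1),
            nn.ReLU(),
        )

    def forward(self, x):
        return self.features(x)

class BiggerNet(nn.Module):
    """Same first layers under the same names, plus new layers."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(8, 16, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        # New layers: not present in the old checkpoint
        self.extra = nn.Sequential(
            nn.Conv2d(16, 16, kernel_size=3, padding=1),
            nn.ReLU(),
        )

    def forward(self, x):
        return self.extra(self.features(x))

old_model = SmallNet()
# Stand-in for: torch.load("model0.pth", map_location="cpu")
pretrained_state = old_model.state_dict()

new_model = BiggerNet()
# strict=False copies every key that matches and skips the others
result = new_model.load_state_dict(pretrained_state, strict=False)

print("left at random init:", result.missing_keys)
print("ignored from checkpoint:", result.unexpected_keys)
```

Checking missing_keys is worthwhile: it should list only the new layers. Note that strict=False tolerates missing or extra names, but a shared name whose tensor shape differs still raises an error.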

Bests

malek · July 8, 2020, 2:52pm

Thank you very much! That was so helpful!

Sabeeha_Mehtab · January 6, 2021, 6:54am

Really helpful, thanks for sharing!!

suresh_thommandru · October 24, 2022, 1:46pm

Sir, I would like to add new convolutional layers before the avgpool and classifier, not after the fully connected layers. Could you show me how?