Load only a part of the network with pretrained weights

Hi, I am working with an 8-layer CNN. I have already trained it and saved the weights (model0.pth). Now I want to add two more layers to the network, for 10 layers in total. During training, I want to initialize the first 8 layers with the pretrained weights from model0.pth and initialize the two new layers randomly.
Can you please tell me whether this can be done in PyTorch and, if so, how?
Thank you in advance for your help.


Hi,

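There are different approaches to this, but personally I would define a new class that wraps the previously defined model, the one you already have weights for. The new layers then go into a separate nn.Sequential inside that class, and forward() runs the input through the pretrained model first and the new layers second. Something like this, using torchvision's AlexNet as the pretrained model: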

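import torch.nn as nn
import torchvision

# AlexNet with ImageNet weights (newer torchvision versions prefer the `weights=` argument)
pretrained = torchvision.models.alexnet(pretrained=True)

class MyAlexNet(nn.Module):
    def __init__(self, my_pretrained_model):
        super(MyAlexNet, self).__init__()
        # Pretrained network, reused with the weights it already holds
        self.pretrained = my_pretrained_model
        # New layers, randomly initialized by default
        self.my_new_layers = nn.Sequential(nn.Linear(1000, 100),
                                           nn.ReLU(),
                                           nn.Linear(100, 2))

    def forward(self, x):
        x = self.pretrained(x)     # 1000-dimensional AlexNet output
        x = self.my_new_layers(x)  # new head maps 1000 -> 100 -> 2
        return x

my_extended_model = MyAlexNet(my_pretrained_model=pretrained)
my_extended_model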

# here is the structure

MyAlexNet(
  (pretrained): AlexNet(
    (features): Sequential(
      (0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
      (1): ReLU(inplace=True)
      (2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
      (3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
      (4): ReLU(inplace=True)
      (5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
      (6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (7): ReLU(inplace=True)
      (8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (9): ReLU(inplace=True)
      (10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (11): ReLU(inplace=True)
      (12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
    )
    (avgpool): AdaptiveAvgPool2d(output_size=(6, 6))
    (classifier): Sequential(
      (0): Dropout(p=0.5, inplace=False)
      (1): Linear(in_features=9216, out_features=4096, bias=True)
      (2): ReLU(inplace=True)
      (3): Dropout(p=0.5, inplace=False)
      (4): Linear(in_features=4096, out_features=4096, bias=True)
      (5): ReLU(inplace=True)
      (6): Linear(in_features=4096, out_features=1000, bias=True)
    )
  )  # till here corresponds to AlexNet's original implementation
  (my_new_layers): Sequential(
    (0): Linear(in_features=1000, out_features=100, bias=True)
    (1): ReLU()
    (2): Linear(in_features=100, out_features=2, bias=True)
  )
)
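
For the case in the question, where the weights come from your own model0.pth rather than torchvision, the same pattern could look roughly like the sketch below. BaseNet, ExtendedNet and all layer sizes are hypothetical stand-ins (the original 8-layer architecture was not posted), and the checkpoint is created on the fly in a temporary directory only so the snippet runs on its own; it also assumes model0.pth holds a state_dict saved with torch.save(model.state_dict(), ...).

```python
import os
import tempfile

import torch
import torch.nn as nn


class BaseNet(nn.Module):
    """Hypothetical stand-in for the original network saved as model0.pth."""

    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Conv2d(1, 4, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(4, 8),
        )

    def forward(self, x):
        return self.layers(x)


class ExtendedNet(nn.Module):
    """Wraps the pretrained network and appends two new Linear layers."""

    def __init__(self, pretrained_model):
        super().__init__()
        self.pretrained = pretrained_model
        self.new_layers = nn.Sequential(
            nn.ReLU(),
            nn.Linear(8, 16),
            nn.ReLU(),
            nn.Linear(16, 2),
        )

    def forward(self, x):
        return self.new_layers(self.pretrained(x))


# Simulate the existing checkpoint; in practice model0.pth already exists.
checkpoint_path = os.path.join(tempfile.mkdtemp(), "model0.pth")
trained = BaseNet()
torch.save(trained.state_dict(), checkpoint_path)

# 1. Rebuild the original architecture and load the saved weights into it.
base = BaseNet()
base.load_state_dict(torch.load(checkpoint_path, map_location="cpu"))

# 2. Wrap it; new_layers keep PyTorch's default random initialization.
model = ExtendedNet(base)

out = model(torch.randn(3, 1, 28, 28))
print(out.shape)  # torch.Size([3, 2])
```

If you would rather define all 10 layers in a single class, load_state_dict(state_dict, strict=False) is another option: it copies the parameters whose names match and reports the rest as missing keys, so the old layers must keep the same attribute names as in the original model.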

Bests


Thank you very much! That was so helpful!

Really helpful, thanks for sharing!!