Deepcopy not showing up in model

I’m trying to load 8 pretrained models into another model in PyTorch. This is the definition of the models I want to load (each sits on top of a VGG net):

self.output_layer=nn.Sequential(nn.Conv2d(512,512, kernel_size=3,padding=2,dilation = 2),
                            nn.Conv2d(512,256, kernel_size=3,padding=2,dilation = 2),
                            nn.Conv2d(256,128, kernel_size=3,padding=2,dilation = 2),
                            nn.Conv2d(128,64, kernel_size=3,padding=2,dilation = 2),
                            nn.Conv2d(64, 1,kernel_size=1))

I’m deep copying these into an array, then loading the weights from the pretrained models, and each gets used on the forward pass as an expert in a gating block.

self.branches = []
for i in range(8):
    self.branches.append(copy.deepcopy(self.output_layer))

Forward pass:

x = self.frontend(x)
regressors = []
for i in range(8):
    regressors.append(self.branches[i](x))
stack = torch.cat(regressors, dim=1)

final = gate[:, :, None, None] * stack
output = torch.sum(final, dim=1, keepdim=True)

These pretrained layers are not showing up in my model. When I run torchsummary on the model, I only see the VGG and gating layers. The output_layer is missing.

Without deepcopying the output_layer, I see all the layers and parameters in the torchsummary (10,798,408 with deepcopy vs. 42,067,280 without deepcopy). However, I need deepcopy so that I can load the weights of each model separately, rather than just creating a list of references to the same output_layer with the same weights.
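To make the difference between references and deep copies concrete, here is a minimal sketch. The small 1x1 convolution is a hypothetical stand-in for output_layer, not the actual model:

```python
import copy
import torch
import torch.nn as nn

# Hypothetical stand-in for output_layer (much smaller than the real one).
layer = nn.Conv2d(4, 1, kernel_size=1)

# A list of references: every entry is the very same module object.
shared = [layer for _ in range(3)]
# A list of deep copies: every entry owns its own weight tensors.
copies = [copy.deepcopy(layer) for _ in range(3)]

with torch.no_grad():
    shared[0].weight.fill_(1.0)   # also changes shared[1] and shared[2]
    copies[0].weight.fill_(2.0)   # leaves copies[1] and copies[2] untouched

shared_same = shared[1].weight is shared[0].weight
copies_independent = not torch.equal(copies[0].weight, copies[1].weight)
```

With references, loading one set of pretrained weights would overwrite all eight branches; with deep copies, each branch can hold its own weights.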

In the meantime, as a workaround, I defined 8 different self.output_layers and added them to the branches array. This seems to work okay.

Hi,

You will need to use nn.ModuleList() so that the submodules (each of your regressors) are registered and found properly.
You can simply replace self.branches=[] with self.branches=nn.ModuleList() and it will work as expected.
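
As a rough sketch of why this matters: modules stored in a plain Python list are not registered with the parent module, so their parameters do not appear in parameters() or state_dict(), and they are not moved by .to(device). The class names, the helper make_head, and the reduced channel counts below are assumptions for illustration only, not the model from the question:

```python
import copy
import torch
import torch.nn as nn

def make_head():
    # Hypothetical, scaled-down version of output_layer to keep the sketch light.
    return nn.Sequential(
        nn.Conv2d(8, 4, kernel_size=3, padding=2, dilation=2),
        nn.Conv2d(4, 1, kernel_size=1),
    )

class ListBranches(nn.Module):
    def __init__(self):
        super().__init__()
        head = make_head()
        # Plain list: the copies are NOT registered as submodules.
        self.branches = [copy.deepcopy(head) for _ in range(8)]

class ModuleListBranches(nn.Module):
    def __init__(self):
        super().__init__()
        head = make_head()
        # nn.ModuleList: each copy is registered as a submodule.
        self.branches = nn.ModuleList([copy.deepcopy(head) for _ in range(8)])

    def forward(self, x):
        # One single-channel map per branch, concatenated along channels.
        return torch.cat([branch(x) for branch in self.branches], dim=1)

def count_params(module):
    return sum(p.numel() for p in module.parameters())

per_head = count_params(make_head())
n_list = count_params(ListBranches())              # branches are invisible here
n_modulelist = count_params(ModuleListBranches())  # all 8 branches are counted
out = ModuleListBranches()(torch.randn(2, 8, 16, 16))
```

Once the branches live in an nn.ModuleList, each one can load its own pretrained weights with self.branches[i].load_state_dict(...), provided the checkpoint keys match the branch's layout.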