I’m trying to load 8 pretrained models into another model in PyTorch. This is the definition of the head I want to load (it sits on top of a VGG frontend):
self.output_layer = nn.Sequential(
    nn.Conv2d(512, 512, kernel_size=3, padding=2, dilation=2),
    nn.Conv2d(512, 256, kernel_size=3, padding=2, dilation=2),
    nn.Conv2d(256, 128, kernel_size=3, padding=2, dilation=2),
    nn.Conv2d(128, 64, kernel_size=3, padding=2, dilation=2),
    nn.Conv2d(64, 1, kernel_size=1),
)
I’m deep-copying this module into a list, loading the weights from the pretrained models into the copies, and each copy is used on the forward pass as an expert in a gating block.
self.branches = []
for i in range(8):
    self.branches.append(copy.deepcopy(self.output_layer))
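For reference, here is a sketch of how that branch setup could load a different pretrained checkpoint into each copy (the checkpoint paths and the helper name are my assumptions, and I use nn.ModuleList rather than a plain list so the copies are registered with the parent module):

```python
import copy
import torch
import torch.nn as nn

def make_branches(template, checkpoint_paths):
    """Deep-copy the template once per expert and load that expert's weights.

    Using nn.ModuleList (instead of a plain Python list) registers each copy
    as a submodule, so its parameters appear in .parameters() and state_dict().
    """
    branches = nn.ModuleList()
    for path in checkpoint_paths:
        branch = copy.deepcopy(template)
        state = torch.load(path, map_location="cpu")
        branch.load_state_dict(state)  # each expert gets its own weights
        branches.append(branch)
    return branches
```

Deep-copying first and then calling load_state_dict means the copies start identical in structure but end up with independent, per-expert weights.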
Forward pass:
x = self.frontend(x)
regressors = []
for i in range(8):
    regressors.append(self.branches[i](x))
stack = torch.cat(regressors, dim=1)
…
final = gate[:, :, None, None] * stack
output = torch.sum(final, dim=1, keepdim=True)
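To make the shapes concrete, the gated combination above can be sketched with dummy tensors (my assumptions: each expert outputs one channel, and I stand in a softmax for the elided gate computation):

```python
import torch

B, H, W, n_experts = 2, 6, 6, 8

# Each expert output: [B, 1, H, W], as with the Conv2d(64, 1, ...) head.
regressors = [torch.randn(B, 1, H, W) for _ in range(n_experts)]
stack = torch.cat(regressors, dim=1)                    # [B, 8, H, W]

# Placeholder for the gate; the real gating block is elided in the post.
gate = torch.softmax(torch.randn(B, n_experts), dim=1)  # [B, 8]

# Broadcast the per-expert weights over the spatial dimensions.
final = gate[:, :, None, None] * stack                  # [B, 8, H, W]
output = torch.sum(final, dim=1, keepdim=True)          # [B, 1, H, W]
```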
These pretrained layers are not showing up in my model. When I run torchsummary on the model, I only see the VGG and gating layers; the output_layer branches are missing.
Without deep-copying the output_layer, I see all the layers and parameters in the torchsummary output (10,798,408 parameters with deepcopy vs. 42,067,280 without). However, I need deepcopy to load each model’s own weights, not just create a list of references to the same output_layer sharing one set of weights.
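A quick check (a sketch with made-up module names) reproduces this symptom: deepcopy itself preserves all parameters, but copies held in a plain Python list are invisible to the parent module, whereas the same copies in an nn.ModuleList are counted:

```python
import copy
import torch.nn as nn

branch = nn.Conv2d(3, 2, kernel_size=1)  # tiny stand-in for output_layer

class WithList(nn.Module):
    def __init__(self):
        super().__init__()
        # Plain list: the copies are NOT registered as submodules.
        self.branches = [copy.deepcopy(branch) for _ in range(8)]

class WithModuleList(nn.Module):
    def __init__(self):
        super().__init__()
        # nn.ModuleList: the copies ARE registered as submodules.
        self.branches = nn.ModuleList(copy.deepcopy(branch) for _ in range(8))

def count_params(m):
    return sum(p.numel() for p in m.parameters())
```

Tools like torchsummary walk the module tree via .parameters()/named_modules(), which is why only registered submodules show up in its report.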
In the meantime, as a workaround, I defined 8 separate output_layer attributes on the model and added those to the branches list. This seems to work okay.