PyTorch transfer learning problem

Hi, I want to train this network and then add 2 convolutional layers after conv layer 5. I trained the network and wrote a second network with the same architecture plus the 2 extra conv layers. I first copied the weights of the shared layers into the new network and then froze those layers, but the new network does not work at all!
I initialized the 2 new conv layers so that they act as an identity:

model2.conv6[0].weight.data = torch.zeros((1, 1, 7, 7))
model2.conv6[0].weight.data[0, 0, 3, 3] = 1
model2.conv6[0].bias.data = torch.zeros((1))
model2.conv7[0].weight.data = torch.zeros((1, 1, 3, 3))
model2.conv7[0].weight.data[0, 0, 1, 1] = 1
model2.conv7[0].bias.data = torch.zeros((1))
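For reference, the copy-and-freeze step looks roughly like this (a minimal sketch, assuming the trained network is model, the extended one is model2, and the shared layers keep the names conv1 through conv5 plus fc):

# Copy the weights of the layers that exist in both networks;
# strict=False skips conv6/conv7, which only exist in model2.
model2.load_state_dict(model.state_dict(), strict=False)

# Freeze the copied layers so only conv6/conv7 receive gradients.
# Note: BatchNorm layers still update their running statistics in
# train() mode even when their parameters are frozen this way.
for name in ['conv1', 'conv2', 'conv3', 'conv4', 'conv5', 'fc']:
    for p in getattr(model2, name).parameters():
        p.requires_grad = False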
the network:

class Net(nn.Module):
    def __init__(self,SR,block_size,phi):
        super(Net,self).__init__()
        self.conv1= nn.Sequential(
            nn.Conv2d(1,64, kernel_size=(9,9), stride=1,padding=4),
            nn.BatchNorm2d(64),
            nn.ReLU()
        )
        self.conv2= nn.Sequential(
            nn.Conv2d(64,32, kernel_size=(7,7), stride=1,padding=3),
            nn.BatchNorm2d(32),
            nn.ReLU()
        )
        self.conv3= nn.Sequential(
            nn.Conv2d(32,16, kernel_size=(5,5), stride=1,padding=2),
            nn.BatchNorm2d(16),
            nn.ReLU()
        )
        self.conv4= nn.Sequential(
            nn.Conv2d(16, 8, kernel_size=(3, 3), stride=1,padding=1),
            nn.BatchNorm2d(8),
            nn.ReLU()
        )
        self.conv5= nn.Sequential(
            nn.Conv2d(8,1,kernel_size=(1,1),stride=1,padding=0)
        )
###########
        #I want to add two conv layers here:
  
        ############
        self.fc=nn.Linear(1600,64)

    def forward(self,kr,y,phi):
        out_conv1=self.conv1(kr)
        out_conv2=self.conv2(out_conv1)
        out_conv3=self.conv3(out_conv2)
        out_conv4=self.conv4(out_conv3)
        out_conv5=self.conv5(out_conv4)
        ###########
        #I want to add two conv layers here:
  
        ############
        out_feedback=kr+out_conv5
        out_linear=self.fc(out_feedback.flatten(2))

        return out_linear
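Filled in at the two marked spots, the extended network would look something like this (my sketch, assuming the new layers sit between conv5 and the skip connection, with padding 3 and 1 so the identity initialization above really reproduces its input):

import torch.nn as nn

class Net2(Net):
    def __init__(self, SR, block_size, phi):
        super(Net2, self).__init__(SR, block_size, phi)
        # The two added layers: single-channel 7x7 and 3x3 convolutions,
        # padded so an identity kernel leaves the feature map unchanged.
        self.conv6 = nn.Sequential(
            nn.Conv2d(1, 1, kernel_size=(7, 7), stride=1, padding=3)
        )
        self.conv7 = nn.Sequential(
            nn.Conv2d(1, 1, kernel_size=(3, 3), stride=1, padding=1)
        )

    def forward(self, kr, y, phi):
        out = self.conv1(kr)
        out = self.conv2(out)
        out = self.conv3(out)
        out = self.conv4(out)
        out_conv5 = self.conv5(out)
        out_conv7 = self.conv7(self.conv6(out_conv5))  # the two new layers
        out_feedback = kr + out_conv7
        return self.fc(out_feedback.flatten(2))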

What do you mean by this?
Can you try experimenting with just the default initializations?
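(In code, trying the defaults would just mean dropping the manual assignments, or undoing them with reset_parameters():)

model2.conv6[0].reset_parameters()  # restore Conv2d's default (Kaiming-uniform) init
model2.conv7[0].reset_parameters()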

Hi, thanks for your reply.
I thought that if I initialized the new layers to act as an identity, the first epoch would behave the same as the first network and the output would be fine, but the new network's output does not match the first network's output from its last epoch.
On the other hand, the new network takes much more time for each epoch, even though the first network has nearly 20,000 parameters and the new network has only about 60 trainable ones.
And the new network's output is not good at all: it is a gray picture that changes only a little in each epoch.
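A quick sanity check for the claim that the first forward pass should match (a sketch, assuming 40x40 single-channel inputs since fc expects 1600 features; y and phi are unused in forward, so None stands in for them):

import torch

kr = torch.randn(1, 1, 40, 40)  # dummy batch; 40*40 = 1600 matches the fc layer
model.eval()   # eval mode so BatchNorm uses its running statistics
model2.eval()
with torch.no_grad():
    out1 = model(kr, None, None)
    out2 = model2(kr, None, None)
# With copied weights and identity-initialized conv6/conv7 (padding 3 and 1),
# the two outputs should match before any further training.
print(torch.allclose(out1, out2, atol=1e-6))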