How to attach a network to part of an old network


(Alireza) #1

So I read Link to understand how to extract the features from a pretrained network.
In the following I use VGG16. VGG has two Sequential modules (features and classifier), so I used the following to keep only the feature part and drop the classifier. (I still don't know exactly how to do it for ResNet, but that is something we can discuss later.)

vgg16 = models.vgg16(pretrained=True)
New_VGG_Without_Classifier = nn.Sequential(*list(vgg16.features))

print(New_VGG_Without_Classifier)

and the output for printing is:

Sequential(
(0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): ReLU(inplace)
(2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): ReLU(inplace)
(4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(6): ReLU(inplace)
(7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(8): ReLU(inplace)
(9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(11): ReLU(inplace)
(12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(13): ReLU(inplace)
(14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(15): ReLU(inplace)
(16): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(17): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(18): ReLU(inplace)
(19): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(20): ReLU(inplace)
(21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(22): ReLU(inplace)
(23): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(24): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(25): ReLU(inplace)
(26): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(27): ReLU(inplace)
(28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(29): ReLU(inplace)
(30): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)

I tested this method by feeding an example input to the new network, and it seems to work:

OutPut = New_VGG_Without_Classifier(InputVar)

print(OutPut.shape)
torch.Size([1, 512, 7, 7])

Here are my questions, and I'd appreciate it if anyone can explain how I can solve them:

1) This is my main question. I designed a network (let's call it SecondNet) as follows, and I want to attach it to the end of my New_VGG_Without_Classifier. How should I do that?

Here is my SecondNet network; I set up the input of SecondNet to be compatible with the output of New_VGG_Without_Classifier (hopefully it is right and I didn't make any mistake).

class SecondNet(nn.Module):
    def __init__(self):
        super(SecondNet, self).__init__()
        # padding=1 keeps the 7x7 map from New_VGG_Without_Classifier
        # from shrinking below the kernel size
        self.conv1 = nn.Conv2d(512, 100, 3, padding=1)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(100, 16, 3, padding=1)
        # 7x7 -> pool -> 3x3 -> pool -> 1x1, so the flattened size is 16 * 1 * 1
        self.fc1 = nn.Linear(16 * 1 * 1, 12)
        self.fc2 = nn.Linear(12, 8)
        self.fc3 = nn.Linear(8, 15)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))   # [N, 100, 3, 3]
        x = self.pool(F.relu(self.conv2(x)))   # [N, 16, 1, 1]
        x = x.view(-1, 16 * 1 * 1)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

2) If I want to get the outputs of the last 3 Conv2d layers, save them, and work with them, how should I do that?


(Jayakrishna Rudra) #2

Regarding your 1st question, you can do this in a single nn.Module class.

class SecondNet(nn.Module):
    def __init__(self):
        super(SecondNet, self).__init__()
        vgg16 = models.vgg16(pretrained=True)
        children = list(vgg16.children())
        # keep everything except the classifier; you can control which
        # children are kept by slicing this list
        self.New_VGG_Without_Classifier = nn.Sequential(*children[:-1])
        self.conv1 = nn.Conv2d(512, 100, 3, padding=1)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(100, 16, 3, padding=1)
        self.fc1 = nn.Linear(16 * 1 * 1, 12)
        self.fc2 = nn.Linear(12, 8)
        self.fc3 = nn.Linear(8, 15)

    def forward(self, x):
        x = self.New_VGG_Without_Classifier(x)
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 1 * 1)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

(Jayakrishna Rudra) #3

Regarding your 2nd question, if I understand it correctly, you want to extract the weights/outputs of an intermediate conv layer.

Each element in the layer list above is an individual layer of the network, with the weights and biases from the original model, so once you get hold of a layer you can save it as you would a regular model:

torch.save(<layer>.state_dict(), <path>)


(Alireza) #4

Thanks for explaining.
Does it matter if instead of:

New_VGG_Without_Classifier = nn.Sequential(*list(child))

I use:

New_VGG_Without_Classifier = nn.Sequential(*list(vgg16.features))


(Jayakrishna Rudra) #5

I guess both should give the same output, since both end up with the same list of layers (as long as you slice the classifier off when using children()).
But I'm not able to try your approach, which I think might be because of a version mismatch.
I currently have PyTorch version 0.4.0.