Break ResNet into two parts

Hi, all:

I'm trying to manipulate some intermediate features of ResNet, so I have to break the pre-trained model into two parts.
I feed the original image into the first part and then feed the output of the first part directly into the second part, but I get a size mismatch in the last linear layer.

RuntimeError: size mismatch, m1: [512 x 1], m2: [512 x 1000]

import torch
import torch.nn as nn
from torchvision import models

resnet18 = models.resnet18(pretrained=True)

# first part: the initial layers
modules = list(resnet18.children())[:3]
resnet_1st = nn.Sequential(*modules)

# second part: the remaining layers
modules = list(resnet18.children())[3:]
resnet_2nd = nn.Sequential(*modules)

# freeze the pre-trained weights
for p in resnet18.parameters():
    p.requires_grad = False

out_1st = resnet_1st(image)  # image: an input batch of shape [N, 3, H, W]
print(out_1st.shape)
out_2nd = resnet_2nd(out_1st)
print(out_2nd.shape)

Anyone know how to solve this? Thanks in advance!


The error is thrown because you are wrapping all submodules in an nn.Sequential container, which is missing the flatten operation defined in resnet's forward.
You could define a custom Flatten module and add it right before the last linear layer:

class Flatten(nn.Module):
    def forward(self, x):
        # collapse all dimensions except the batch dimension
        return x.view(x.size(0), -1)

model = resnet18  # the pre-trained model from above

modules = list(model.children())[:3]
resnet_1st = nn.Sequential(*modules)

modules = list(model.children())[3:-1]
resnet_2nd = nn.Sequential(*[*modules, Flatten(), list(model.children())[-1]])

x = torch.randn(1, 3, 224, 224)
out_1st = resnet_1st(x)
print(out_1st.shape)
out_2nd = resnet_2nd(out_1st)
print(out_2nd.shape)
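
As a side note, newer PyTorch releases ship a built-in nn.Flatten module, so the custom class isn't strictly needed; the same split could be written as (a minimal sketch, assuming the same pre-trained model):

modules = list(model.children())
resnet_1st = nn.Sequential(*modules[:3])
resnet_2nd = nn.Sequential(*modules[3:-1], nn.Flatten(), modules[-1])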

Thank you so much!


When splitting a predefined nn.Sequential in order to get an intermediate layer's output in forward, can we slice it directly, e.g. model.submodel_name[:n] (which I found is still an nn.Sequential), instead of extracting the layers from model.children() and wrapping them in an nn.Sequential again? Are there any issues to be concerned about? Thanks.


Your approach should work:

import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 6, 3, 1, 1),   # -> [N, 6, 24, 24]
    nn.ReLU(),
    nn.MaxPool2d(2),            # -> [N, 6, 12, 12]
    nn.Conv2d(6, 12, 3, 1, 1),  # -> [N, 12, 12, 12]
    nn.ReLU(),
    nn.Flatten(),               # -> [N, 12*12*12]
    nn.Linear(12*12*12, 10),
)

x = torch.randn(10, 3, 24, 24)
out_ref = model(x)

out1 = model[:5](x)
out2 = model[5:](out1)

print((out_ref - out2).abs().max())
> tensor(0., grad_fn=<MaxBackward1>)
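
Note that slicing an nn.Sequential returns a new nn.Sequential that shares the underlying submodules (and thus their parameters) with the original, so nothing is copied; a quick check, assuming the model above:

part = model[:5]
print(type(part))           # <class 'torch.nn.modules.container.Sequential'>
print(part[0] is model[0])  # True: same module object, shared parameters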

I tried to implement this code and it is working fine. But if I try to modify the last layer, I get an error. I am new to PyTorch, so I am not able to figure out what is wrong; I only understood that there is some dimension mismatch caused by the last linear layer. Please help me resolve the issue. The code I am running is as follows:

from torchvision import models
import torch
import torch.nn as nn

class Flatten(nn.Module):
    def __init__(self):
        super(Flatten, self).__init__()

    def forward(self, x):
        x = x.view(x.size(0), -1)
        return x

model = models.resnet152(pretrained=True)

modules = list(model.children())[:6]
resnet_1st = nn.Sequential(*modules)

modules = list(model.children())[6:-1]
resnet_2nd = nn.Sequential(*[*modules, Flatten(), list(model.children())[-1]])

model.fc = nn.Linear(model.fc.in_features, 512)

x = torch.randn(1, 3, 224, 224)
out_1st = resnet_1st(x)
print(out_1st.shape)
out_2nd = resnet_2nd(out_1st)
print(out_2nd.shape)

for param in model.parameters():
    param.requires_grad = False

out = model.fc(out_2nd)
print(out.shape)

Error I am getting:

torch.Size([1, 512, 28, 28])
torch.Size([1, 1000])

RuntimeError                              Traceback (most recent call last)
<ipython-input> in <module>()
     26 for param in model.parameters():
     27     param.requires_grad = False
---> 28 out = model.fc(out_2nd)
     29 print(out.shape)

2 frames
/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in linear(input, weight, bias)
   1846     if has_torch_function_variadic(input, weight, bias):
   1847         return handle_torch_function(linear, (input, weight, bias), input, weight, bias=bias)
-> 1848     return torch._C._nn.linear(input, weight, bias)
   1849
   1850

RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x1000 and 2048x512)

resnet_2nd already contains the last linear layer (model.fc), as you are explicitly adding it in:

resnet_2nd = nn.Sequential(*[*modules, Flatten(), list(model.children())[-1]])

after the custom Flatten() module.
Later you are replacing the model.fc layer with a new nn.Linear layer, which won’t change resnet_2nd anymore.
out_2nd is thus the output of the original last linear layer and has a shape of [batch_size, 1000].
In this line of code:

out = model.fc(out_2nd)

you are then trying to pass this output to the newly initialized model.fc layer, which creates the shape mismatch.
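
A quick way to see this, reusing the variables from your snippet:

# resnet_2nd still holds the original 1000-class head, not the new fc
print(resnet_2nd[-1] is model.fc)   # False: model.fc was replaced afterwards
print(resnet_2nd[-1].out_features)  # 1000 (original ImageNet classifier)
print(model.fc.out_features)        # 512  (new layer, expecting 2048 inputs)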

Thank you for the information.

Actually, I want to split the resnet as above, and I also want to reduce the output dimension to 512 in the FC layer. How can I do that? I tried a lot but could not get it to work. Please help me.

It wouldn’t make sense to reuse the last linear layer.

Unsure what you’ve tried, but replacing the layer before creating the nn.Sequential container should work:

modules = list(model.children())[6:-1]
model.fc = nn.Linear(model.fc.in_features, 512)
resnet_2nd = nn.Sequential(*[*modules, Flatten(), list(model.children())[-1]])
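
For completeness, rerunning the forward pass from your snippet should now yield the reduced output directly (shapes assuming resnet152 and a 224x224 input):

x = torch.randn(1, 3, 224, 224)
out_1st = resnet_1st(x)
print(out_1st.shape)   # torch.Size([1, 512, 28, 28])
out_2nd = resnet_2nd(out_1st)
print(out_2nd.shape)   # torch.Size([1, 512])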

Hi, I have a situation where I have two dataloaders, i.e. Source1_loader and Source2_loader. I want to train batches of data from both of these dataloaders simultaneously on resnet_1st (here I want resnet_1st to learn from both datasets). I tried to implement it but am not getting proper results.
I have the feeling that the network is not learning from the second source. Please help me solve the issue. My code is:

model = models.resnet152(pretrained=True)
modules = list(model.children())[:7]
self.resnet_1st = nn.Sequential(*modules)

modules = list(model.children())[7:-1]
self.net1 = nn.Sequential(*[*modules, Flatten(), list(model.children())[-1]])

modules = list(model.children())[7:-1]
self.net2 = nn.Sequential(*[*modules, Flatten(), list(model.children())[-1]])

for param in model.parameters():
    param.requires_grad = False

In the forward part:

data_src1 = self.resnet_1st(data_src1)  # extract features of Source1
data_src2 = self.resnet_1st(data_src2)  # extract features of Source2

data_src1 = self.net1(data_src1)
data_src2 = self.net2(data_src2)

You are freezing all models, since they are sharing parameters:

for param in model.parameters():
    param.requires_grad = False

for name, param in net1.named_parameters():
    print(name, param.requires_grad)
    
for name, param in net2.named_parameters():
    print(name, param.requires_grad)

will print requires_grad=False for all parameters of net1 and net2.
Use copy.deepcopy if you want to create copies of the models.
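
A minimal sketch of that approach, assuming the module layout from your post:

import copy

modules = list(model.children())[7:-1]
head = nn.Sequential(*[*modules, Flatten(), list(model.children())[-1]])

net1 = copy.deepcopy(head)  # independent copy with its own parameters
net2 = copy.deepcopy(head)  # independent copy with its own parameters

# the copies no longer share modules with `model`, so freezing
# model.parameters() afterwards won't touch net1 or net2
print(net1[0] is net2[0])   # False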


I am trying to do something similar, with the intention of using the features for an OCR model. The model needs to intelligently decide what text is relevant, so I was thinking of using "high level" features as well as "low level" features. The intention is to use the resnet as an encoder and a transformer-based model as a decoder, with cross-attention on the resnet features. I am using the following code to do this:

import torch
import torch.nn as nn
from torchvision import models

class CustomResNet(nn.Module):
    def __init__(self, out_dim):
        super(CustomResNet, self).__init__()
        resnet = models.resnet152(weights=models.ResNet152_Weights.DEFAULT)
        
        self.first = nn.Sequential(*list(resnet.children())[:6])
        self.second = nn.Sequential(*list(resnet.children())[6:-2])
        self.out_proj = nn.Linear(501760, out_dim)
        
    def forward(self, x):
        # forward pass to the point of interest
        intermediate = self.first(x)    # low-level features: [N, 512, 28, 28] for a 224x224 input
        x = self.second(intermediate)   # high-level features: [N, 2048, 7, 7]

        # flatten and concatenate: 512*28*28 + 2048*7*7 = 501760 features
        x = torch.flatten(x, 1)
        intermediate = torch.flatten(intermediate, 1)
        concatenated_features = torch.cat((x, intermediate), dim=1)

        out = self.out_proj(concatenated_features)
        return out

Does my intuition and implementation here make sense? Is there a known/tested method for reducing the dimensionality of the features (from 501760 in my instance) to something more reasonable?