Transfer learning using a GitHub repo

Hi,

I’m trying to slice a network in the middle and then add an fc layer to extract features. So, I followed these steps:

model = nn.Sequential(*list(model.children())[:-5])  # slice the network

for param in model.parameters():  # freeze the network
    param.requires_grad = False

class Flatten(nn.Module):  # Flatten
    def __init__(self):
        super(Flatten, self).__init__()

    def forward(self, x):
        x = x.view(x.size(0), -1)
        return x

model_new = nn.Sequential(model, XXConv(64, 64, 16, 3, use_bias=False), Flatten(), nn.Linear(64+64, 10), nn.ReLU(), nn.Dropout(0.4))  # add a conv and a fc layer

After these steps, when I try to run the network, I get the error below:

"forward() takes 2 positional arguments but 3 were given"

Any help would be appreciated.

Could you post the complete stack trace as well as an executable code snippet so that we could have a look?
Based on the error, you might either be passing 2 inputs to the nn.Sequential container, or an intermediate layer returns 2 outputs, which are not accepted by the following layer. The latter issue might happen if the original model handled multiple inputs/outputs in its forward method, which the nn.Sequential container isn't able to do directly.
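For illustration, here is a minimal, self-contained sketch (the layer sizes are made up) showing that an nn.Sequential container only forwards a single input, which reproduces this error:

import torch
import torch.nn as nn

seq = nn.Sequential(nn.Linear(8, 4), nn.ReLU())
x = torch.randn(2, 8)
y = torch.randn(2, 8)

out = seq(x)     # works: the single input is passed through the layers in order
out = seq(x, y)  # TypeError: forward() takes 2 positional arguments but 3 were given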

Hi, I am using a GitHub repo, so I am not sure if I can post their code. Can I? This is a segmentation problem. The network is called ConvPoint.
Here is the executable code from my notebook:

from networks.network_seg import SegBig as Net
model = Net(1, 10)   # input channels and number of classes

PATH = './state00_dict.pth'
model.load_state_dict(torch.load(PATH, map_location = device))

model = nn.Sequential(*list(model.children())[:-22])

for param in model.parameters():
    param.requires_grad = False

class Flatten(nn.Module):
    def __init__(self):
        super(Flatten, self).__init__()
    def forward(self,x):
        x = x.view(x.size(0),-1)
        return x

model_new = nn.Sequential(model, PtConv(64, 64, 16, 3, use_bias=False), Flatten(), nn.Linear(64+64, 10), nn.ReLU(),  nn.Dropout(0.4))
train_dir = os.path.join('./training_10')
filelist_train = [dataset for dataset in os.listdir(train_dir)]
print(f"done, {len(filelist_train)} train files")

import npm3d_seg as npm

ds = npm.PartDataset(filelist_train, train_dir,
                                training=True, block_size=8,
                                iteration_number=8*1000,
                                npoints=8192)
train_loader = torch.utils.data.DataLoader(ds, batch_size=8, shuffle=True,
                                            num_workers=4
                                            )
optimizer = torch.optim.Adam(model_new.parameters(), lr=1e-3)

for epoch in range(1000):
    model_new.train()
    train_loss = 0
    N_CLASSES = 10
    cm = np.zeros((N_CLASSES, N_CLASSES))
    t = tqdm(train_loader, ncols=100, desc="Epoch {}".format(epoch))
    for pts, features, seg in t:
        features = features.to(device)
        pts = pts.to(device)
        seg = seg.to(device)
        optimizer.zero_grad()
        outputs = model_new(features, pts)
        
        loss =  F.cross_entropy(outputs.view(-1, N_CLASSES), seg.view(-1))
        loss.backward()
        optimizer.step()

The error is:

TypeError                                 Traceback (most recent call last)
<ipython-input-36-24e2b7e6760e> in <module>
     10         seg = seg.to(device)
     11         optimizer.zero_grad()
---> 12         outputs = model_new(features, pts)
     13 
     14         loss =  F.cross_entropy(outputs.view(-1, N_CLASSES), seg.view(-1))

~/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    539             result = self._slow_forward(*input, **kwargs)
    540         else:
--> 541             result = self.forward(*input, **kwargs)
    542         for hook in self._forward_hooks.values():
    543             hook_result = hook(self, input, result)

TypeError: forward() takes 2 positional arguments but 3 were given

Thanks for the code.
model_new is an nn.Sequential container, which only accepts a single input, while you are passing two:

outputs = model_new(features, pts)

If you want to append layers to the original model (which might accept two inputs), one approach would be to write a new custom model, create the new layers in it, and define the forward method as needed, e.g. as sketched below.
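As an illustration only, here is a minimal sketch of such a wrapper. The (features, pts) signature of the backbone, the 64-dim output, and the new head are assumptions based on your snippet, not the actual ConvPoint API, so adapt the shapes and layers (e.g. PtConv) accordingly:

import torch
import torch.nn as nn

class NewModel(nn.Module):
    def __init__(self, backbone, num_classes=10):
        super(NewModel, self).__init__()
        self.backbone = backbone              # pretrained (and possibly frozen) model
        self.fc = nn.Linear(64, num_classes)  # new layer(s) to train; in_features must match the backbone output
        self.drop = nn.Dropout(0.4)

    def forward(self, features, pts):
        x = self.backbone(features, pts)      # the backbone itself handles both inputs
        x = self.drop(torch.relu(self.fc(x))) # new head on top of the extracted features
        return x

# usage (assuming pretrained_model.forward accepts (features, pts)):
# model_new = NewModel(pretrained_model)
# outputs = model_new(features, pts)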

I am new to DL and the PyTorch framework, sorry for asking simple questions.

How do I write the new custom model when I am using transfer learning? Does it mean I won't be able to use these lines

model = nn.Sequential(*list(model.children())[:-22])
model_new = nn.Sequential(model, PtConv(64, 64, 16, 3, use_bias=False), Flatten(), nn.Linear(64+64, 10), nn.ReLU(),  nn.Dropout(0.4))
train_dir = os.path.join('./training_10')

as both of them use nn.Sequential()?
Is there any example that I can go through to understand the concept?
I thought I had to use nn.Sequential() to combine the previous model with the new layers to create a new network.
Thanks

You can use nn.Sequential to easily create new models that apply their layers in a sequential way.
Wrapping child modules from other models into an nn.Sequential container might work, but can also easily break, since e.g. all functional calls from the original forward would be missing, and the order of the modules returned by model.children() defines the new nn.Sequential model.

Here is a small code example to show the shortcomings of this approach:

import torch
import torch.nn as nn
import torch.nn.functional as F

# default setup
class MyModelA(nn.Module):
    def __init__(self):
        super(MyModelA, self).__init__()
        self.conv = nn.Conv2d(3, 6, 3, 1, 1)
        self.fc = nn.Linear(6*24*24, 10)
        
    def forward(self, x):
        x = self.conv(x)
        x = F.relu(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        return x

model = MyModelA()
x = torch.randn(2, 3, 24, 24)
out = model(x)

modelA_seq = nn.Sequential(*list(model.children()))
out_seq = modelA_seq(x) # breaks: F.relu and the view call from the original forward are missing

# define all operations as modules
class MyModelB(nn.Module):
    def __init__(self):
        super(MyModelB, self).__init__()
        self.conv = nn.Conv2d(3, 6, 3, 1, 1)
        self.relu = nn.ReLU()
        self.flatten = nn.Flatten()
        self.fc = nn.Linear(6*24*24, 10)
        
    def forward(self, x):
        x = self.conv(x)
        x = self.relu(x)
        x = self.flatten(x)
        x = self.fc(x)
        return x

model = MyModelB()
x = torch.randn(2, 3, 24, 24)
out = model(x)

modelB_seq = nn.Sequential(*list(model.children()))
out_seq = modelB_seq(x) # works
print(torch.allclose(out, out_seq))

# change order
class MyModelC(nn.Module):
    def __init__(self):
        super(MyModelC, self).__init__()
        self.fc = nn.Linear(6*24*24, 10)
        self.flatten = nn.Flatten()
        self.relu = nn.ReLU()
        self.conv = nn.Conv2d(3, 6, 3, 1, 1)        
        
    def forward(self, x):
        x = self.conv(x)
        x = self.relu(x)
        x = self.flatten(x)
        x = self.fc(x)
        return x

model = MyModelC()
x = torch.randn(2, 3, 24, 24)
out = model(x)

modelC_seq = nn.Sequential(*list(model.children()))
print(modelC_seq) # wrong order!
out_seq = modelC_seq(x) # doesn't work, since model.children() returns the wrong order

I would thus create a new custom model, pass the pretrained one into it, add the new layers, and call them appropriately in the forward. Alternatively, you can use the pretrained model as the base class and override the forward method, e.g. as in the sketch below.
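Using MyModelB from above as a stand-in for the pretrained model, a minimal sketch of the second approach could look like this (the new head and the strict=False loading are just placeholders, not the ConvPoint-specific solution):

class MyModelD(MyModelB):
    def __init__(self):
        super(MyModelD, self).__init__()
        self.new_fc = nn.Linear(10, 5)  # new layer on top of the pretrained ones

    def forward(self, x):
        # reuse the pretrained layers explicitly ...
        x = self.conv(x)
        x = self.relu(x)
        x = self.flatten(x)
        x = self.fc(x)
        # ... and add the new one
        x = self.new_fc(x)
        return x

model = MyModelD()
# model.load_state_dict(torch.load(PATH), strict=False)  # would load matching pretrained weights, skipping the new head
out = model(torch.randn(2, 3, 24, 24))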