Multi-module forward error

Aryan_Asadian · August 28, 2020, 4:23pm

Hi. I’m trying to create a model from the sub-layers of two different models. The first model is defined like this:

import torch.nn as nn
import torch.nn.functional as F

class Model5Layer_128(nn.Module):
    def __init__(self):
        super(Model5Layer_128, self).__init__()
        self.conv1 = nn.Conv2d(in_channels= 3,out_channels= 6,kernel_size= 5)
        self.pool = nn.MaxPool2d(kernel_size=2,stride=2)
        self.conv2 = nn.Conv2d(in_channels= 6, out_channels= 16, kernel_size= 5)
        self.fc1 = nn.Linear(in_features=16 * 5 * 5, out_features= 128)
        self.fc2 = nn.Linear(in_features= 128,out_features= 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.softmax(x,dim=-1)
        return x

Then I take part of an instance of this model, add a regressor, and train it. For the second phase, I remove the regressor and try to rebuild the full model (a model with the same architecture as the one above) using this method:

# remove the regressor from our model, return the trained partial model
def get_cnn_regressor_removed_partial_student():
    basic_model = Model5Layer_128()
    cnn_regressor_added_student = nn.Sequential(*list(basic_model.children())[:3])
    return cnn_regressor_added_student

combined_student = nn.Sequential(regressor_removed_student,*list(model_128.children())[3:])
print(combined_student)

It seems correct, but when I try to train the whole model or run torchsummary on it, I get this error:

RuntimeError: size mismatch, m1: [320 x 10], m2: [400 x 128] at /pytorch/aten/src/TH/generic/THTensorMath.cpp:41

Any help would be appreciated.

ptrblck · August 30, 2020, 10:00am

I’m not sure whether basic_model and model_128 both refer to Model5Layer_128, but if so, you should be careful about putting the child modules into an nn.Sequential block.
nn.Sequential is meant for simple models, where every operation can be expressed as an nn.Module and the modules run strictly in order.
In your Model5Layer_128 the forward method also uses functional calls and tensor methods (F.relu, x.view, and F.softmax) and applies self.pool a second time. None of these are returned by children(), so they are missing from the rebuilt model.
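
To make the mismatch concrete, here is a minimal sketch that reproduces it. It assumes a 32x32 RGB input (the size implied by 16 * 5 * 5) and a batch of 2, which is not stated in the question. The first three children are conv1, pool, and conv2, so the flatten step and the second pooling never run, and fc1 receives a [2, 16, 10, 10] tensor whose last dimension is 10 instead of 400:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Model5Layer_128(nn.Module):
    # Same architecture as in the question.
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, kernel_size=5)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(6, 16, kernel_size=5)
        self.fc1 = nn.Linear(16 * 5 * 5, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return F.softmax(x, dim=-1)


model = Model5Layer_128()

# children() only yields registered modules, each one once.
child_names = [name for name, _ in model.named_children()]
print(child_names)

# Assumed input: batch of 2 images, 3 x 32 x 32.
x = torch.randn(2, 3, 32, 32)

# conv1 -> pool -> conv2, without ReLU and without the second pooling.
front = nn.Sequential(*list(model.children())[:3])
front_out = front(x)
print(front_out.shape)

# Chaining all children feeds the 4D tensor directly into fc1.
rebuilt = nn.Sequential(*list(model.children()))
try:
    rebuilt(x)
    failed = False
except RuntimeError as err:
    failed = True
    print(err)
```

With these assumed shapes, the 4D tensor is treated as 2 * 16 * 10 = 320 rows of length 10, which matches the [320 x 10] in the error message.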

The better approach would be to create a new custom module and define its forward method properly.

Aryan_Asadian · August 30, 2020, 11:30am

Thank you for your reply. How can I use part of a pre-defined model to create a new custom module? That is, I want to take an already defined model (such as basic_model in our case), use some of its first layers, and add a new layer at the end (a regressor, mainly for dimension compatibility).

ptrblck · August 30, 2020, 10:50pm

Here is a code snippet showing the proposed workflow:

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        
        # load the pretrained model here (or pass as argument to __init__)
        pretrained = models....
        
        # extract layers either by assigning them to new attributes
        self.features = pretrained.features
        # or by using nn.Sequential, if possible
        # (children() returns a generator, so convert it to a list before slicing)
        self.features = nn.Sequential(*list(pretrained.children())[:3])
        
        # add your custom layers here
        self.classifier = ...
        
        
    def forward(self, x):
        # define the new forward using the pretrained modules, your custom layers
        # and the functional API
        x = self.features(x)
        x = F.relu(x)
        x = self.classifier(x)
        return x

In this code snippet you can see that I’m reusing some pretrained layers and could even wrap them in an nn.Sequential block, while the forward method is still defined explicitly.
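
Applied to the Model5Layer_128 from the question, the same workflow could look roughly like the sketch below. The class names PartialStudent and CombinedStudent, the 1x1 convolution used as the regressor, its 64 output channels, and the 32x32 input are all assumptions made for illustration; they are not taken from the original code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Model5Layer_128(nn.Module):
    # Same architecture as in the question.
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, kernel_size=5)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(6, 16, kernel_size=5)
        self.fc1 = nn.Linear(16 * 5 * 5, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return F.softmax(x, dim=-1)


class PartialStudent(nn.Module):
    """Hypothetical phase-1 model: first layers of `base` plus a regressor."""

    def __init__(self, base, regressor_channels=64):
        super().__init__()
        # Reuse the module objects, so the weights are shared with `base`.
        self.conv1 = base.conv1
        self.pool = base.pool
        self.conv2 = base.conv2
        # Assumed regressor: a 1x1 conv that only changes the channel count.
        self.regressor = nn.Conv2d(16, regressor_channels, kernel_size=1)

    def features(self, x):
        # Same operations as the first two lines of the original forward.
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        return x

    def forward(self, x):
        return self.regressor(self.features(x))


class CombinedStudent(nn.Module):
    """Hypothetical phase-2 model: trained features without the regressor."""

    def __init__(self, partial, base):
        super().__init__()
        self.partial = partial
        self.fc1 = base.fc1
        self.fc2 = base.fc2

    def forward(self, x):
        x = self.partial.features(x)   # skips the regressor
        x = x.view(x.size(0), -1)      # flatten to [batch, 400]
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))        # mirrors the original forward
        return F.softmax(x, dim=-1)


base = Model5Layer_128()
partial = PartialStudent(base)
combined = CombinedStudent(partial, base)

x = torch.randn(4, 3, 32, 32)          # assumed input size
hint = partial(x)                      # phase 1 output, with regressor
out = combined(x)                      # phase 2 output, regressor removed
print(hint.shape, out.shape)
```

Because the layers are assigned as attributes rather than copied, training PartialStudent updates the same conv1 and conv2 weights that CombinedStudent uses later.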

Let me know if you have more questions or get stuck.