Hi. I’m trying to create a model by using sub-layers of two different models. The first model is defined in this way:
    import torch.nn as nn
    import torch.nn.functional as F

    class Model5Layer_128(nn.Module):
        def __init__(self):
            super().__init__()
            self.conv1 = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=5)
            self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
            self.conv2 = nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5)
            self.fc1 = nn.Linear(in_features=16 * 5 * 5, out_features=128)
            self.fc2 = nn.Linear(in_features=128, out_features=10)

        def forward(self, x):
            x = self.pool(F.relu(self.conv1(x)))
            x = self.pool(F.relu(self.conv2(x)))
            # flatten the 16 x 5 x 5 feature maps before the linear layers
            x = x.view(-1, 16 * 5 * 5)
            x = F.relu(self.fc1(x))
            x = F.relu(self.fc2(x))
            x = F.softmax(x, dim=-1)
            return x
Then I take part of an instance of this model, add a regressor, and train it. For the second phase, I remove the regressor and try to rebuild the full model (a model with the same architecture as the one above) using this method:
# remove the regressor from our model, return the trained partial model
basic_model = Model5Layer_128()
cnn_regressor_added_student = nn.Sequential(*list(basic_model.children())[:3])
combined_student = nn.Sequential(regressor_removed_student,*list(model_128.children())[3:])
It seems correct, but when I try to train the whole model or use torch summary, I get this error:
RuntimeError: size mismatch, m1: [320 x 10], m2: [400 x 128] at /pytorch/aten/src/TH/generic/THTensorMath.cpp:41
Any help would be appreciated.
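I’m not sure if model_128 refers to Model5Layer_128, but if so, you should be careful about putting the child modules into an nn.Sequential container. nn.Sequential is used for simple models, where each operation can be expressed via an nn.Module and the modules are simply called one after another. In Model5Layer_128 you are using the functional API (e.g. F.relu and F.softmax), which will not be added to the final model. The same goes for the x.view call that flattens the activations before fc1.
The better approach would be to create a new custom module and define the forward method yourself.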
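To make the mismatch concrete, here is a minimal sketch, assuming a 3x32x32 input (which is what the 16 * 5 * 5 in fc1 implies) and a batch size of 2. The layer definitions are copied from the model above; the names features_only, children_only and with_modules are made up for illustration. Rebuilding the model from its children skips the relu calls, the second pooling and the view, so fc1 receives a [2, 16, 10, 10] tensor. A linear layer acts on the last dimension, so it sees 2 * 16 * 10 = 320 rows of 10 features, which would match the m1: [320 x 10] in the error. If every functional call is swapped for its module counterpart (nn.ReLU, nn.Flatten, nn.Softmax), nn.Sequential can reproduce the original forward.

```python
import torch
import torch.nn as nn

# Same layers as in Model5Layer_128
conv1 = nn.Conv2d(3, 6, kernel_size=5)
pool = nn.MaxPool2d(kernel_size=2, stride=2)
conv2 = nn.Conv2d(6, 16, kernel_size=5)
fc1 = nn.Linear(16 * 5 * 5, 128)
fc2 = nn.Linear(128, 10)

x = torch.randn(2, 3, 32, 32)  # assumed input size and batch size

# What runs before fc1 when the model is rebuilt from children():
# no relu, pooling applied only once, no flattening
features_only = nn.Sequential(conv1, pool, conv2)
out = features_only(x)
print(out.shape)

# The full children-only model fails at fc1 (expects 400 features, gets 10)
children_only = nn.Sequential(conv1, pool, conv2, fc1, fc2)
try:
    children_only(x)
    failed = False
except RuntimeError as err:
    failed = True
    print("RuntimeError:", err)

# Every functional call replaced by a module, in the original order
with_modules = nn.Sequential(
    conv1, nn.ReLU(), pool,
    conv2, nn.ReLU(), pool,
    nn.Flatten(),
    fc1, nn.ReLU(),
    fc2, nn.ReLU(),
    nn.Softmax(dim=-1),
)
y = with_modules(x)
print(y.shape)
```

This only works because each step here has a module equivalent; for anything less regular, a custom module with its own forward is easier to read and to debug.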
Thank you for your reply. How can I use part of a pre-defined model to create a new custom module? That is, I want to take an already defined model (such as basic_model in our case), use some of its first layers, and add a new layer at the end (a regressor, mainly for dimension compatibility).
Here is a code snippet showing the proposed workflow:
    class MyModel(nn.Module):  # MyModel is a placeholder name
        def __init__(self):
            super().__init__()
            # load the pretrained model here (or pass it as an argument to __init__)
            pretrained = models....
            # extract layers either by assigning them to new attributes
            self.features = pretrained.features
            # or by using nn.Sequential, if possible
            # (children() returns a generator, so convert it to a list before slicing)
            self.features = nn.Sequential(*list(pretrained.children())[:3])
            # add your custom layers here
            self.classifier = ...

        def forward(self, x):
            # define the new forward using the pretrained modules, your custom layers
            # and the functional API
            x = self.features(x)
            x = F.relu(x)
            x = self.classifier(x)
            return x
In this code snippet you can see that I’m reusing some pretrained layers and could potentially even wrap them in an nn.Sequential block, while the forward method is defined explicitly.
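For the model in this thread, the same workflow could look roughly like the sketch below. It rests on a few assumptions: the base model is the Model5Layer_128 architecture from the question (re-declared so the snippet runs on its own), the input is 3x32x32, and the regressor is a single nn.Linear with one output. StudentWithRegressor and the regressor's shape are hypothetical and only there to show the pattern.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Model5Layer_128(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, kernel_size=5)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(6, 16, kernel_size=5)
        self.fc1 = nn.Linear(16 * 5 * 5, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return F.softmax(x, dim=-1)


class StudentWithRegressor(nn.Module):  # hypothetical name
    def __init__(self, base):
        super().__init__()
        # reuse the first layers of the base model by assigning them to attributes
        self.conv1 = base.conv1
        self.pool = base.pool
        self.conv2 = base.conv2
        # assumed regressor head: flattened features -> one value
        self.regressor = nn.Linear(16 * 5 * 5, 1)

    def forward(self, x):
        # the functional calls live here, so nothing is lost
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = torch.flatten(x, 1)  # keep the batch dimension
        return self.regressor(x)


basic_model = Model5Layer_128()
student = StudentWithRegressor(basic_model)
out = student(torch.randn(4, 3, 32, 32))
print(out.shape)
```

Note that the reused layers are shared with basic_model rather than copied, so training the student also updates basic_model's conv1 and conv2. If that is not what you want, pass in copy.deepcopy(basic_model) instead.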
Let me know if you have more questions or get stuck.