Adding layers to MobileNetV2

Hi all,

I’m new to the DL field and PyTorch.
I want to use a pretrained MobileNetV2, but instead of its classifier I’d like to add a 1x1 convolutional layer + max pooling + another convolutional layer + some linear layers (all of this to reduce the output to fewer dimensions so I can cluster it later on).

When I tried to run the code below, I got “Expected 4-dimensional input for 4-dimensional weight [1280, 1280, 2, 2], but got 2-dimensional input of size [2, 1280] instead”.

Here’s the code:
import torch
import torch.nn as nn
from torchvision import models
from torchsummary import summary  # assuming summary() comes from torchsummary

model = models.mobilenet_v2(pretrained=True)
for param in model.parameters():
    param.requires_grad = False

# modify last layer of the net
fc = nn.Sequential(
    nn.Conv2d(1280, 1280, kernel_size=(2, 2), stride=(2, 2), bias=False),
    nn.MaxPool2d(kernel_size=2, stride=2, padding=0),
    nn.Conv2d(1280, 320, kernel_size=(1, 1), stride=(1, 1), bias=False),
    nn.Dropout(0.2),
    nn.BatchNorm1d(320),
    nn.ReLU(),
    nn.Dropout(0.3),
    nn.Linear(320, NUM_CLASSES),
    nn.LogSoftmax(dim=1)
)
model.classifier = fc

criterion = nn.NLLLoss()
optimizer = torch.optim.Adam(model.classifier.parameters(), lr=0.0015)
model.to(DEVICE)
summary(model, input_size=(3, 224, 224))

Has anyone managed to do this successfully? Or can you tell me where I’m going wrong?

Thanks a lot in advance!

The original forward implementation will flatten the activation output in this line of code before passing it to the classifier.
You could add a custom module at the beginning of your nn.Sequential container and reshape the activation back or alternatively create a custom model and override the forward method.

Once this is solved, note that your nn.Sequential module is missing an nn.Flatten layer after the last nn.Conv2d layer (and also uses nn.BatchNorm1d, which might be a typo).
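
If you go with the second option (a custom model overriding forward), a rough sketch could look like the code below. Note that this is only an illustration: it assumes a 224x224 input, so model.features returns a [batch_size, 1280, 7, 7] activation, and the head layers (as well as the MobileNetV2ConvHead name) are placeholders rather than your exact architecture:

import torch
import torch.nn as nn
from torchvision import models

class MobileNetV2ConvHead(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        self.features = models.mobilenet_v2(pretrained=True).features
        for param in self.features.parameters():
            param.requires_grad = False
        # the head receives the unflattened [N, 1280, 7, 7] activation
        self.head = nn.Sequential(
            nn.Conv2d(1280, 320, kernel_size=1, bias=False),
            nn.BatchNorm2d(320),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # -> [N, 320, 1, 1]
            nn.Flatten(),             # -> [N, 320]
            nn.Dropout(0.2),
            nn.Linear(320, num_classes),
            nn.LogSoftmax(dim=1)
        )

    def forward(self, x):
        x = self.features(x)  # skip the flattening done in the default forward
        return self.head(x)

model = MobileNetV2ConvHead(num_classes=10)
out = model(torch.randn(2, 3, 224, 224))
print(out.shape)  # torch.Size([2, 10])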

Thank you very much for answering!

I wanted to implement your suggestion of reshaping the activation back, but I couldn’t figure out how to get from the 2-dimensional input of size [2, 1280] to something that matches the 4-dimensional weight [1280, 1280, 2, 2] (I guess the 2, 2 come from the kernel size and stride).

So, I tried to define the forward pass using only mobilenet_v2.features, as written below (I thought this way I would just avoid the flattening of the activation), but that resulted in the following error: “Expected 4-dimensional input for 4-dimensional weight [1280, 1280, 1, 1], but got 2-dimensional input of size [2, 62720] instead”.

Here’s the current code:

class Flatten(nn.Module):
    def forward(self, input):
        return input.view(input.size(0), -1)

class MobileNet_ConvAdd(nn.Module):
    def __init__(self):
        super(MobileNet_ConvAdd, self).__init__()
        self.model_M = models.mobilenet_v2(pretrained=True)
        self.model_M.classifier = nn.Identity()
        for param in self.model_M.parameters():
            param.requires_grad = False

        self.MobileNet_ConvAdd_conv1 = nn.Sequential(
            nn.Flatten(),
            nn.Conv2d(1280, 1280, kernel_size=(2, 2), stride=(2, 2), bias=False),
            nn.MaxPool2d(kernel_size=2, stride=2, padding=0),
            nn.ReLU(),
            nn.Conv1d(1280, 320, kernel_size=(1, 1), stride=(1, 1), bias=False),
            nn.AvgPool2d(kernel_size=2, stride=2, padding=0),
            nn.Flatten()
        )

        self.MobileNet_ConvAdd_fc2 = nn.Sequential(
            nn.Dropout(0.2),
            nn.BatchNorm1d(1280),  # 320
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(320, NUM_CLASSES),
            nn.LogSoftmax(dim=1)
        )

    def forward(self, x):
        x = self.model_M.features(x)
        x = self.MobileNet_ConvAdd_conv1(x)
        x = self.MobileNet_ConvAdd_fc2(x)
        return x

The first nn.Flatten() layer in self.MobileNet_ConvAdd_conv1 flattens the incoming tensor, which creates a shape mismatch in the following nn.Conv2d.
nn.X2d layers expect an input activation of [batch_size, channels, height, width], while the nn.Linear layer expects an activation of [batch_size, in_features] (in the default setup).

Remove the first Flatten layer and make sure you really want to use nn.BatchNorm1d in the second nn.Sequential container.
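
For illustration, here is one way the two containers could line up shape-wise, assuming a 224x224 input so that self.model_M.features(x) is [N, 1280, 7, 7]. Besides removing the first Flatten, this sketch also swaps the nn.Conv1d for an nn.Conv2d, drops the nn.AvgPool2d (the activation is already 1x1 at that point), and uses nn.BatchNorm1d(320) to match the flattened feature size; treat it as a sketch, not the only valid layout:

self.MobileNet_ConvAdd_conv1 = nn.Sequential(
    nn.Conv2d(1280, 1280, kernel_size=(2, 2), stride=(2, 2), bias=False),  # [N, 1280, 7, 7] -> [N, 1280, 3, 3]
    nn.MaxPool2d(kernel_size=2, stride=2, padding=0),                      # -> [N, 1280, 1, 1]
    nn.ReLU(),
    nn.Conv2d(1280, 320, kernel_size=(1, 1), stride=(1, 1), bias=False),   # -> [N, 320, 1, 1]
    nn.Flatten()                                                           # -> [N, 320]
)

self.MobileNet_ConvAdd_fc2 = nn.Sequential(
    nn.Dropout(0.2),
    nn.BatchNorm1d(320),  # the input is now [N, 320], so BatchNorm1d with 320 features fits
    nn.ReLU(),
    nn.Dropout(0.3),
    nn.Linear(320, NUM_CLASSES),
    nn.LogSoftmax(dim=1)
)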