MobileNet-V2 PyTorch implementation

I was trying to re-implement MobileNet-V2 from the paper. According to Table 2 (page 5), after the sequence of inverted residual blocks there should be two more regular convolutional layers. But in PyTorch's implementation, the second regular convolutional layer (the last row of Table 2) has been removed. What is the reason for this modification?
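For reference, here is what I see in torchvision's version (assuming a recent torchvision, where the head lives in the classifier attribute):

import torchvision

# Inspect the classifier head of torchvision's MobileNet-V2
model = torchvision.models.mobilenet_v2()
print(model.classifier)
# e.g. Sequential(Dropout(p=0.2), Linear(in_features=1280, out_features=1000))

So the final conv2d 1x1 from the table is replaced by a Linear layer.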

I am not certain, but everything appears to be correct. The conv2d 1x1 in the last row is just the classifier: once the feature map has been pooled down to 1x1, a convolution with a 1x1 filter can be used in place of a Linear layer. The output will be the same; from a mathematical point of view, they are the same operation. I suppose the authors used that format in the table to keep the conventions consistent, without introducing a new type of layer.

See example:


import torch

number_of_classes = 1000
batch_size = 512
# Fake feature map with the shape MobileNet-V2 produces before the
# classifier: (batch, 1280 channels, 7x7 spatial)
i = torch.rand(batch_size, 1280, 7, 7)

class NetworkA(torch.nn.Module):
    def __init__(self):
        super(NetworkA, self).__init__()
        self.linear = torch.nn.Linear(1280, number_of_classes)

    def forward(self, x):
        # Global average pool to 1x1, flatten, then apply the Linear classifier
        x = torch.nn.functional.adaptive_avg_pool2d(x, (1, 1)).reshape(x.shape[0], -1)
        return self.linear(x)

class NetworkB(torch.nn.Module):
    def __init__(self):
        super(NetworkB, self).__init__()
        self.conv = torch.nn.Conv2d(1280, number_of_classes, kernel_size=1)

    def forward(self, x):
        # Global average pool to 1x1, apply the 1x1 conv classifier, then flatten
        x = torch.nn.functional.adaptive_avg_pool2d(x, (1, 1))
        return self.conv(x).reshape(x.shape[0], -1)

netA = NetworkA()
print(netA(i).shape)

netB = NetworkB()
print(netB(i).shape)

The result:

torch.Size([512, 1000])
torch.Size([512, 1000])
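
To make the equivalence concrete (beyond matching shapes), you can copy the Linear weights into the 1x1 convolution and check that the two networks produce numerically identical outputs:

# Copy the Linear weights/bias into the 1x1 convolution so both heads
# compute the same function, then compare the outputs.
with torch.no_grad():
    netB.conv.weight.copy_(netA.linear.weight.reshape(number_of_classes, 1280, 1, 1))
    netB.conv.bias.copy_(netA.linear.bias)

print(torch.allclose(netA(i), netB(i), atol=1e-6))  # True

On a 1x1 input, the 1x1 convolution reduces to exactly the matrix multiplication the Linear layer performs, so the outputs match to floating-point precision.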