Convert Linear Layers (Fully Connected) to Conv2d Layers

Hi everyone! In my CNN, the __init__ method defines several Linear layers that are fed with feature maps produced by Conv2d layers. With the network structure as-is, my network depends on the input image size. Is there a way to convert the linear layers to Conv2d and obtain the same results?

This is the code:

...
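# Smoothing Layers (reduce feature maps to 256 channels)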
self.smooth1 = nn.Conv2d(512, 256, kernel_size=3, stride=1, padding=1)
self.smooth2 = nn.Conv2d(768, 256, kernel_size=3, stride=1, padding=1)
self.smooth3 = nn.Conv2d(1024, 256, kernel_size=3, stride=1, padding=1)

# Fully Connected Layers
self.fc5 = nn.Linear(256*25*25, num_classes)
self.fc4 = nn.Linear(256*50*50, num_classes)
self.fc3 = nn.Linear(256*100*100, num_classes)
self.fc2 = nn.Linear(256*200*200, num_classes)

In the Linear layers, 25, 50, 100, and 200 depend on the input image size. Is there a way to make them “independent”?

You could transform the linear layer into a conv layer with a 1x1 spatial kernel, but the in_features of the linear layer would become the in_channels of the conv layer, so you wouldn’t gain anything.
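To make that concrete, here is a minimal sketch of the conversion, assuming num_classes = 10 and the 256x25x25 activation that feeds fc5. Note the conv still requires that exact input shape, which is why nothing is gained:

import torch
import torch.nn as nn

num_classes = 10  # assumed for illustration

# in_features of the Linear becomes in_channels of the 1x1 Conv2d
fc = nn.Linear(256 * 25 * 25, num_classes)
conv = nn.Conv2d(256 * 25 * 25, num_classes, kernel_size=1)

# Copy the weights: the conv kernel is the linear weight with two trailing 1x1 dims
with torch.no_grad():
    conv.weight.copy_(fc.weight.view(num_classes, 256 * 25 * 25, 1, 1))
    conv.bias.copy_(fc.bias)

x = torch.randn(1, 256, 25, 25)
out_fc = fc(x.flatten(1))                # shape [1, num_classes]
out_conv = conv(x.reshape(1, -1, 1, 1))  # shape [1, num_classes, 1, 1]
print(torch.allclose(out_fc, out_conv.flatten(1), atol=1e-5))  # True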
The usual approach to relax the size dependency is to add an adaptive pooling layer after the feature extractor (the conv-pool layers) in order to create a predefined activation shape, which matches the in_features of the following linear layer.
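A minimal sketch of that approach, again with assumed values (num_classes = 10 and the 25x25 target size taken from fc5). nn.AdaptiveAvgPool2d always produces the requested output shape, so the same linear layer works for any input resolution:

import torch
import torch.nn as nn

num_classes = 10  # assumed for illustration
smooth1 = nn.Conv2d(512, 256, kernel_size=3, stride=1, padding=1)
pool = nn.AdaptiveAvgPool2d((25, 25))  # output is always 256x25x25
fc5 = nn.Linear(256 * 25 * 25, num_classes)

for size in (200, 317, 512):  # arbitrary input resolutions
    x = torch.randn(1, 512, size, size)
    out = fc5(pool(smooth1(x)).flatten(1))
    print(out.shape)  # torch.Size([1, 10]) every time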