Flatten Layer VGG16

Dear Community,

I would like to extract the feature representations from specific layers of the pretrained VGG16 network.

This is the model architecture:

VGG(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU(inplace)
    (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (3): ReLU(inplace)
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (6): ReLU(inplace)
    (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (8): ReLU(inplace)
    (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU(inplace)
    (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (13): ReLU(inplace)
    (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (15): ReLU(inplace)
    (16): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (17): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (18): ReLU(inplace)
    (19): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (20): ReLU(inplace)
    (21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (22): ReLU(inplace)
    (23): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (24): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (25): ReLU(inplace)
    (26): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (27): ReLU(inplace)
    (28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (29): ReLU(inplace)
    (30): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (classifier): Sequential(
    (0): Linear(in_features=25088, out_features=4096, bias=True)
    (1): ReLU(inplace)
    (2): Dropout(p=0.5)
    (3): Linear(in_features=4096, out_features=4096, bias=True)
    (4): ReLU(inplace)
    (5): Dropout(p=0.5)
    (6): Linear(in_features=4096, out_features=1000, bias=True)
  )
)

This is the code that I have:

import torch.nn as nn

class VGG16FeatureExtractor(nn.Module):
    def __init__(self, vgg16_model, extract_features_layers, extract_classifier_layers):
        super(VGG16FeatureExtractor, self).__init__()
        self.submodule_features = vgg16_model._modules['features']
        self.submodule_classifier = vgg16_model._modules['classifier']
        self.extract_features_layers = extract_features_layers
        self.extract_classifier_layers = extract_classifier_layers

    def forward(self, x):
        featuremaps = []
        # pass through the convolutional part, collecting the requested layers
        for id, module in self.submodule_features._modules.items():
            x = module(x)
            if int(id) in self.extract_features_layers:
                featuremaps.append(x)
        # pass through the classifier part, collecting the requested layers
        for id, module in self.submodule_classifier._modules.items():
            x = module(x)
            if int(id) in self.extract_classifier_layers:
                featuremaps.append(x)
        return featuremaps

Now I get the following error:

RuntimeError: size mismatch, m1: [3584 x 7], m2: [25088 x 4096] at c:\programdata\miniconda3\conda-bld\pytorch_1524546371102\work\aten\src\thc\generic/THCTensorMathBlas.cu:249

The batch size of the dataloader is set to 1.
The VGG16 model has two Sequential submodules, features and classifier.
I noticed that the flattening step between the two submodules seems to be missing in my forward pass.
What am I missing?
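For reference, if I read the torchvision source correctly, the flattening does not live in either Sequential but in VGG.forward itself, roughly like this (paraphrased, not the exact code):

def forward(self, x):
    x = self.features(x)
    x = x.view(x.size(0), -1)   # flatten to [batch, 512 * 7 * 7] = [batch, 25088]
    x = self.classifier(x)
    return x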

Thanks for any help

If I add x = x.view(x.numel()) to flatten the intermediate feature representation before the classifier part, the dimensions match, but only for a dataloader batch size of 1:

    def forward(self, x):
        featuremaps = []
        for id, module in self.submodule_features._modules.items():
            x = module(x)
            if int(id) in self.extract_features_layers:
                featuremaps.append(x)
        # flattens everything, including the batch dimension, into one 1-D vector
        x = x.view(x.numel())
        for id, module in self.submodule_classifier._modules.items():
            x = module(x)
            if int(id) in self.extract_classifier_layers:
                featuremaps.append(x)
        return featuremaps
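A minimal check of the shapes shows why (the Linear layer matches the first classifier layer of VGG16, but the snippet is purely illustrative):

import torch
import torch.nn as nn

fc = nn.Linear(25088, 4096)         # first classifier layer of VGG16

x = torch.randn(1, 512, 7, 7)       # features output for batch size 1
print(fc(x.view(x.numel())).shape)  # works: 25088 elements match in_features

x = torch.randn(2, 512, 7, 7)       # features output for batch size 2
# fc(x.view(x.numel()))             # fails: 50176 elements, because the
#                                   # batch dimension was flattened away too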

I believe x = x.view(x.size(0), -1) does the trick, since it keeps the batch dimension and only flattens the per-sample dimensions to 512 * 7 * 7 = 25088.
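A quick comparison of the two view calls (shapes again just illustrative):

import torch

x = torch.randn(4, 512, 7, 7)       # features output for a batch of 4
print(x.view(x.size(0), -1).shape)  # torch.Size([4, 25088]): batch dim kept
print(x.view(x.numel()).shape)      # torch.Size([100352]): batch dim lost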

Maybe you can comment on that.

Thanks :slight_smile: