Removing padding from features extracted from pre-trained VGG16

I’m using a pre-trained VGG16 as the backbone for a U-Net segmentation model. The code I have is working fine, but I was wondering if there is a way of removing the padding from the feature layers.

In the source code for the pre-trained VGG16 model, the padding of the conv layers seems to be set to 1 (https://pytorch.org/docs/stable/_modules/torchvision/models/vgg.html#vgg16). So I would like to know if there is a way of altering this after importing the pre-trained model.

For example, in the code:

import torchvision.models as models

vgg16 = models.vgg16(pretrained=True)
conv1 = vgg16.features[0]

print(conv1)

Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))

I would like to remove the padding, to reduce the effect of artifacts at the edges.

Thanks in advance for any advice.

You could just set the new padding values directly in the modules.
While setting them to e.g. 0 would work from a technical point of view, I’m not sure how much the model performance would suffer from it.
Here is a small example:

import torch
import torch.nn as nn
import torchvision.models as models

vgg16 = models.vgg16()
x = torch.randn(2, 3, 224, 224)

# Register hooks that print the output shape of every conv layer
for module in vgg16.features.modules():
    if isinstance(module, nn.Conv2d):
        module.register_forward_hook(lambda m, input, output: print(output.shape))

# Forward pass with shape outputs
output = vgg16(x)
> torch.Size([2, 64, 224, 224])
torch.Size([2, 64, 224, 224])
torch.Size([2, 128, 112, 112])
torch.Size([2, 128, 112, 112])
torch.Size([2, 256, 56, 56])
torch.Size([2, 256, 56, 56])
torch.Size([2, 256, 56, 56])
torch.Size([2, 512, 28, 28])
torch.Size([2, 512, 28, 28])
torch.Size([2, 512, 28, 28])
torch.Size([2, 512, 14, 14])
torch.Size([2, 512, 14, 14])
torch.Size([2, 512, 14, 14])


for module in vgg16.features.modules():
    if isinstance(module, nn.Conv2d):
        module.padding = (0, 0)

# Forward pass with smaller shapes        
output = vgg16(x)
> torch.Size([2, 64, 222, 222])
torch.Size([2, 64, 220, 220])
torch.Size([2, 128, 108, 108])
torch.Size([2, 128, 106, 106])
torch.Size([2, 256, 51, 51])
torch.Size([2, 256, 49, 49])
torch.Size([2, 256, 47, 47])
torch.Size([2, 512, 21, 21])
torch.Size([2, 512, 19, 19])
torch.Size([2, 512, 17, 17])
torch.Size([2, 512, 6, 6])
torch.Size([2, 512, 4, 4])
torch.Size([2, 512, 2, 2])
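As a side note, these sizes follow directly from the standard conv output-size formula out = (in + 2*padding - kernel_size) / stride + 1: each 3x3 conv without padding loses 2 pixels per spatial dimension, and each 2x2 max pool halves (and floors) the size. Here is a small sketch (conv_out is just a helper written for this post) that reproduces the conv output sizes printed above:

# Reproduce the conv output sizes from the listing above
def conv_out(size, kernel_size=3, padding=0, stride=1):
    return (size + 2 * padding - kernel_size) // stride + 1

size = 224
for n_convs in (2, 2, 3, 3, 3):   # number of 3x3 convs per VGG16 stage
    for _ in range(n_convs):
        size = conv_out(size)     # 3x3 conv, no padding: loses 2 pixels per dim
        print(size)
    size = size // 2              # 2x2 max pool with stride 2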

Since the last output activation before the nn.AdaptiveAvgPool2d layer is now smaller than the pooling layer’s output size of (7, 7), the values would be repeated a lot:


x = torch.randn(1, 1, 2, 2)
avgpool = nn.AdaptiveAvgPool2d((7, 7))
out = avgpool(x)

print(x)
> tensor([[[[-0.7769,  0.0930],
          [ 1.1264, -0.6808]]]])
print(out)
> tensor([[[[-0.7769, -0.7769, -0.7769, -0.3419,  0.0930,  0.0930,  0.0930],
          [-0.7769, -0.7769, -0.7769, -0.3419,  0.0930,  0.0930,  0.0930],
          [-0.7769, -0.7769, -0.7769, -0.3419,  0.0930,  0.0930,  0.0930],
          [ 0.1747,  0.1747,  0.1747, -0.0596, -0.2939, -0.2939, -0.2939],
          [ 1.1264,  1.1264,  1.1264,  0.2228, -0.6808, -0.6808, -0.6808],
          [ 1.1264,  1.1264,  1.1264,  0.2228, -0.6808, -0.6808, -0.6808],
          [ 1.1264,  1.1264,  1.1264,  0.2228, -0.6808, -0.6808, -0.6808]]]])

However, of course you don’t need to set all padding arguments to zero and thus might avoid this issue. :wink:
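For instance, here is a minimal sketch (the number of layers to change is an arbitrary choice, just for illustration) that removes the padding only from the first few conv layers and keeps padding=1 everywhere else:

# Start from a fresh model, then only strip the padding from the first few conv layers,
# so the activations don't shrink as much before the adaptive pooling
vgg16 = models.vgg16()
num_unpadded = 4   # arbitrary choice, just for illustration
idx = 0
for module in vgg16.features.modules():
    if isinstance(module, nn.Conv2d):
        if idx < num_unpadded:
            module.padding = (0, 0)
        idx += 1

You could then rerun the forward pass with the hooks from above to check that the last activation is still reasonably large.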


Brilliant - thanks very much!