I wanted to change the parameter in a MaxPool2d layer of a VGG16 network from ceil_mode=False to ceil_mode=True to meet a strict output-shape requirement when reproducing a piece of code. Can this be done if I am using a pretrained vgg16 model from torchvision.models?
Yes, this should work okay.
The question you should ask yourself when modifying the architecture
of a pretrained model is whether the parameters (weights, etc.) of the
modified model still mean more or less the same thing as they did in the
original model.

In the case of your proposed VGG16 modification, the following points
apply:

MaxPool2d does not contain any trainable parameters, so no
pretrained weights are affected by the change.
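You can verify this directly; here is a minimal standalone check (not VGG-specific):

```python
import torch

pool = torch.nn.MaxPool2d(kernel_size=2, stride=2, ceil_mode=True)

# A pooling layer has no weights or biases, so there is nothing
# pretrained to invalidate when you change ceil_mode.
print(list(pool.parameters()))  # []
```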
As for the convolutional layers, setting ceil_mode =
True will sometimes change slightly the size of the “images”
output by the
MaxPool2d layers. But the convolutional layers are
pretty much agnostic about the spatial sizes of their inputs, so
their weights retain essentially their original meaning.
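To make the size difference concrete, here is a small sketch: on an odd-sized input, floor mode drops the partial window while ceil mode keeps it, and a convolution happily accepts either size:

```python
import torch

x = torch.randn(1, 3, 7, 7)  # odd spatial size, so floor vs. ceil differ

floor_pool = torch.nn.MaxPool2d(2, 2, ceil_mode=False)
ceil_pool = torch.nn.MaxPool2d(2, 2, ceil_mode=True)

print(floor_pool(x).shape)  # torch.Size([1, 3, 3, 3])
print(ceil_pool(x).shape)   # torch.Size([1, 3, 4, 4])

# A convolutional layer is agnostic about the spatial size it receives;
# its weights mean the same thing either way.
conv = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1)
print(conv(floor_pool(x)).shape)  # torch.Size([1, 8, 3, 3])
print(conv(ceil_pool(x)).shape)   # torch.Size([1, 8, 4, 4])
```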
At the end, you have some fully-connected Linear layers that do
expect inputs of a fixed size. But the fully-connected “classifier”
section of VGG16 is preceded by an AdaptiveAvgPool2d layer
that outputs an “image” of spatial size 7 x 7, regardless of whether
your ceil_mode = True modifications have changed the size of
the input to that AdaptiveAvgPool2d layer.
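You can see this buffering effect in isolation; a quick sketch using the same layer configuration as vgg.avgpool:

```python
import torch

avgpool = torch.nn.AdaptiveAvgPool2d((7, 7))  # same config as vgg.avgpool

# Whatever spatial size the convolutional stack produces, the adaptive
# pool squeezes it to 7 x 7, so the classifier's fixed input size
# (512 * 7 * 7) is always satisfied.
for size in (7, 8, 13):
    out = avgpool(torch.randn(1, 512, size, size))
    print(out.shape)  # torch.Size([1, 512, 7, 7]) every time
```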
All in all, the modified architecture will still work, and the weights of the
layers will retain their original meanings, so that their original pretrained
values will still make sense and work together properly with one another.
This remains true even though the sizes of various intermediate “images”
(sometimes called “feature maps”) will be modestly different as a result
of your ceil_mode = True modifications.
See this illustration in which layer 23 (chosen more or less at random) of
VGG16’s features module has your ceil_mode = True modification
applied to it:
>>> import torch
>>> torch.__version__
'1.10.2'
>>> import torchvision
>>> torchvision.__version__
'0.11.3'
>>>
>>> _ = torch.manual_seed (2022)
>>>
>>> vgg = torchvision.models.vgg16 (pretrained = True)
Downloading: "https://download.pytorch.org/models/vgg16-397923af.pth" to C:\<path_to_.cache>\torch\hub\checkpoints\vgg16-397923af.pth
100%|███████████████████████████████████████████████████████████████████████████████| 528M/528M [00:39<00:00, 13.8MB/s]
>>>
>>> t = torch.randn (1, 3, 244, 244)
>>>
>>> vgg.features[23]
MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
>>> predA = vgg (t)
>>>
>>> vgg.features[23] = torch.nn.MaxPool2d (2, 2, ceil_mode = True)
>>>
>>> vgg.features[23]
MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=True)
>>> predB = vgg (t)
>>>
>>> predA[0, 0:10]
tensor([-1.1322,  2.0340, -1.0934, -1.5980, -0.4693,  0.9064,  0.4962,  0.7050,
         0.4435,  0.7932], grad_fn=<SliceBackward0>)
>>> predB[0, 0:10]
tensor([-0.2408,  2.8836, -1.5900, -1.4564, -0.6028,  2.5356,  0.9514,  0.7236,
         1.3219,  0.4366], grad_fn=<SliceBackward0>)
>>>
>>> vgg.avgpool
AdaptiveAvgPool2d(output_size=(7, 7))
Notice that the resulting prediction (for a random input) does change
when the network is modified, but that it still retains much of the structure
of the prediction made by the original network.