Missing ReLU layers in torchvision resnet50?

Using the torchvision resnet50 model, I noticed that printing the model with a simple

print(model)

prints out:
ResNet(
(conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace)
(maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
(layer1): Sequential(
(0): Bottleneck(
(conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace)
(downsample): Sequential(
(0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
You can see there are no ReLU layers between the layers of the Bottleneck block, whereas if you look at the code for the forward pass of the Bottleneck module:

    def forward(self, x):
        residual = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)
        out = self.bn3(out)

        if self.downsample is not None:
            residual = self.downsample(x)

        out += residual
        out = self.relu(out)

        return out

You can see there are ReLU layers in between the Conv2d layers. Why are these ReLU layers not showing up in the model structure? Only the last ReLU layer is printed out.

If you look closely at the forward function, there is only one ReLU layer (self.relu), but it is used three times. This works because ReLU layers do not learn any parameters, as opposed to convolution or batch normalization layers.
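As a minimal sketch (a toy module, not the torchvision code), a single parameter-free nn.ReLU can be registered once and then called any number of times in forward:

    import torch
    import torch.nn as nn

    class TinyBlock(nn.Module):
        def __init__(self):
            super().__init__()
            self.fc1 = nn.Linear(8, 8)
            self.fc2 = nn.Linear(8, 8)
            # ReLU holds no learnable state, so one instance can be shared
            self.relu = nn.ReLU(inplace=True)

        def forward(self, x):
            x = self.relu(self.fc1(x))  # first use
            x = self.relu(self.fc2(x))  # second use of the same module
            return x

    block = TinyBlock()
    print(block)                  # lists fc1, fc2 and a single (relu) entry
    out = block(torch.randn(2, 8))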

To answer your question: print only prints the module in question and all of its submodules (which in turn print their submodules, etc.). It does not print the “structure” of the network or the forward function. The modules might even be printed in a different order than they are used in the forward pass, depending on the order of definition in the module’s constructor.
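A hypothetical example (module and attribute names made up for illustration): the print order follows the constructor, and purely functional calls leave no trace at all:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class OddOrder(nn.Module):
        def __init__(self):
            super().__init__()
            self.second = nn.Linear(8, 8)  # defined first, used second
            self.first = nn.Linear(8, 8)   # defined second, used first

        def forward(self, x):
            out = self.first(x)
            out = self.second(out)
            return F.relu(out + x)         # the addition and F.relu never appear in print()

    print(OddOrder())  # prints (second) before (first); no sign of + or F.relu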


Quick follow-up question: how would I print the model structure?

There is no built-in solution for that. If you think about it, it is a hard question: how would you print skip connections? How would you handle simple operations or function calls inside the forward function (e.g. +, torch.cat, or torch.mean)? What about recurrent networks?
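One option, assuming a PyTorch version that ships torch.fx (1.8 or later), is to trace the forward pass; the resulting graph records every call in execution order, including the residual additions and the repeated relu uses:

    import torch.fx
    import torchvision.models as models

    model = models.resnet50()
    traced = torch.fx.symbolic_trace(model)  # records the forward pass as a graph
    print(traced.graph)                      # one node per operation, in execution order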

This could help with the visualization: