Discrepancy between using nn.ReLU directly in an nn.Sequential block vs. defining the activation function in __init__ and then reusing it inside the nn.Sequential block
I am using torch.nn.Sequential to define my network as follows:
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(ResidualBlock, self).__init__()
        # A single ReLU instance, reused inside the sub-block built below
        self.activation = nn.ReLU()
        self.block = nn.Sequential(
            self.get_layer(in_channels, out_channels),
            nn.Conv2d(out_channels, 2 * out_channels, kernel_size=1),
            nn.BatchNorm2d(2 * out_channels)
        )

    def get_layer(self, in_channels, out_channels, kernel_size=1, padding=0):
        return nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size, padding=padding),
            nn.BatchNorm2d(out_channels),
            self.activation  # the shared nn.ReLU instance
        )

    def forward(self, x):
        return self.block(x)
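Instantiating and running the block works without error (a quick sketch; the channel sizes here are illustrative, chosen to match the summary output further below):

import torch

block = ResidualBlock(64, 32)
x = torch.randn(1, 64, 32, 32)  # (batch, channels, height, width)
out = block(x)
print(out.shape)  # torch.Size([1, 64, 32, 32])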
When I inspect the model with the torchsummary package, however, I get duplicate ReLUs in my forward pass, as shown below:
----------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================
Conv2d-1 [-1, 32, 32, 32] 2,080
BatchNorm2d-2 [-1, 32, 32, 32] 64
ReLU-3 [-1, 32, 32, 32] 0
ReLU-4 [-1, 32, 32, 32] 0
Conv2d-5 [-1, 64, 32, 32] 2,112
BatchNorm2d-6 [-1, 64, 32, 32] 128
================================================================
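For completeness, the summary above was produced with a call along these lines (the input size is my assumption, inferred from the parameter counts in the table):

from torchsummary import summary

model = ResidualBlock(64, 32)
# device="cpu" since torchsummary defaults to "cuda"
summary(model, input_size=(64, 32, 32), device="cpu")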
However, this issue does not arise when I replace self.activation inside the nn.Sequential with nn.ReLU() directly. Is my understanding of nn.Sequential wrong?
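For reference, this is the variant of get_layer that does not show the duplicate entry (a minimal sketch of what I mean by replacing self.activation with nn.ReLU() directly):

    def get_layer(self, in_channels, out_channels, kernel_size=1, padding=0):
        return nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size, padding=padding),
            nn.BatchNorm2d(out_channels),
            nn.ReLU()  # a fresh instance instead of the shared self.activation
        )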