Discrepancy between using nn.ReLU directly in an nn.Sequential block vs. defining the activation function in __init__ and then reusing it inside the nn.Sequential block
I am using torch.nn.Sequential to define my network as follows:
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(ResidualBlock, self).__init__()
        # A single ReLU instance, reused inside the sub-block built below
        self.activation = nn.ReLU()
        self.block = nn.Sequential(
            self.get_layer(in_channels, out_channels),
            nn.Conv2d(out_channels, 2 * out_channels, kernel_size=1),
            nn.BatchNorm2d(2 * out_channels)
        )

    def get_layer(self, in_channels, out_channels, kernel_size=1, padding=0):
        return nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size, padding=padding),
            nn.BatchNorm2d(out_channels),
            self.activation  # the shared nn.ReLU instance
        )

    def forward(self, x):
        return self.block(x)
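Instantiating and running the block works without error (a quick sketch; the channel sizes here are illustrative, chosen to match the summary output further below):

import torch

block = ResidualBlock(64, 32)
x = torch.randn(1, 64, 32, 32)  # (batch, channels, height, width)
out = block(x)
print(out.shape)  # torch.Size([1, 64, 32, 32])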
When I inspect the model with the torchsummary package, however, I get duplicate ReLUs in my forward pass, as shown below:
----------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================
Conv2d-1 [-1, 32, 32, 32] 2,080
BatchNorm2d-2 [-1, 32, 32, 32] 64
ReLU-3 [-1, 32, 32, 32] 0
ReLU-4 [-1, 32, 32, 32] 0
Conv2d-5 [-1, 64, 32, 32] 2,112
BatchNorm2d-6 [-1, 64, 32, 32] 128
================================================================
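For completeness, the summary above was produced with a call along these lines (the input size is my assumption, inferred from the parameter counts in the table):

from torchsummary import summary

model = ResidualBlock(64, 32)
# device="cpu" since torchsummary defaults to "cuda"
summary(model, input_size=(64, 32, 32), device="cpu")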
However, this issue does not arise when I replace self.activation inside the nn.Sequential with nn.ReLU() directly. Is my understanding of nn.Sequential wrong?
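For reference, this is the variant of get_layer that does not show the duplicate entry (a minimal sketch of what I mean by replacing self.activation with nn.ReLU() directly):

    def get_layer(self, in_channels, out_channels, kernel_size=1, padding=0):
        return nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size, padding=padding),
            nn.BatchNorm2d(out_channels),
            nn.ReLU()  # a fresh instance instead of the shared self.activation
        )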