Batch Norm Example in CNNs: Is this correct?

Hi! I’m trying to implement batch normalization for a CNN. I found this sample code on the Internet, but I’m not sure whether using batch norm after a convolutional layer in this way is correct. Can anyone give me some ideas? Thank you!

The example code is shown below:

import torch
import torch.nn as nn

class CNN_batch(nn.Module):

    # Constructor
    def __init__(self, out_1=16, out_2=32, number_of_classes=10):
        super(CNN_batch, self).__init__()
        # First conv block: conv -> 2D batch norm over the out_1 feature maps
        self.cnn1 = nn.Conv2d(in_channels=1, out_channels=out_1, kernel_size=5, padding=2)
        self.conv1_bn = nn.BatchNorm2d(out_1)
        self.maxpool1 = nn.MaxPool2d(kernel_size=2)

        # Second conv block: conv -> 2D batch norm over the out_2 feature maps
        self.cnn2 = nn.Conv2d(in_channels=out_1, out_channels=out_2, kernel_size=5, stride=1, padding=2)
        self.conv2_bn = nn.BatchNorm2d(out_2)
        self.maxpool2 = nn.MaxPool2d(kernel_size=2)

        # Classifier head: linear layer followed by 1D batch norm on the logits
        self.fc1 = nn.Linear(out_2 * 4 * 4, number_of_classes)
        self.bn_fc1 = nn.BatchNorm1d(number_of_classes)

    # Prediction
    def forward(self, x):
        x = self.cnn1(x)
        x = self.conv1_bn(x)
        x = torch.relu(x)
        x = self.maxpool1(x)
        x = self.cnn2(x)
        x = self.conv2_bn(x)
        x = torch.relu(x)
        x = self.maxpool2(x)
        x = x.view(x.size(0), -1)  # flatten to (batch, out_2 * 4 * 4)
        x = self.fc1(x)
        x = self.bn_fc1(x)
        return x

(As you can see, it applies self.conv1_bn and self.conv2_bn directly after the corresponding convolutional layers. Does this work?)

Yes, this should work. Do you see any issues, or is this just a general question?
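
For reference, a quick sanity check that the forward pass runs (a minimal sketch; the 16x16 input size is an assumption inferred from fc1 expecting out_2 * 4 * 4 features after two 2x2 poolings):

import torch

model = CNN_batch()
model.train()                  # modules default to train mode; batch norm needs batch size > 1 here
x = torch.randn(8, 1, 16, 16)  # dummy batch: 8 single-channel 16x16 images
out = model(x)
print(out.shape)               # torch.Size([8, 10])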

You usually wouldn’t see a batch norm layer (self.bn_fc1) at the very end of the model, but it might fit your use case. I would nevertheless compare it to a model without this layer and check which one works better.
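
For that comparison, a minimal sketch of such a variant (CNN_batch_no_final_bn is a hypothetical name; it reuses the layers defined above and only skips the final batch norm):

class CNN_batch_no_final_bn(CNN_batch):
    # Hypothetical variant for the comparison: same layers as CNN_batch,
    # but the forward pass returns the raw logits from fc1 without bn_fc1.
    def forward(self, x):
        x = self.maxpool1(torch.relu(self.conv1_bn(self.cnn1(x))))
        x = self.maxpool2(torch.relu(self.conv2_bn(self.cnn2(x))))
        x = x.view(x.size(0), -1)
        return self.fc1(x)  # logits go straight to the loss (e.g. nn.CrossEntropyLoss)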

Hi ptrblck! Thank you for your reply. No, I didn’t see any issues, and the model works. I was just wondering whether these steps are reasonable. Thanks a lot for the clarification.

Yes, I also find the output layer quite strange, as batch norm layers usually come before the output layer, not after it. Thanks for the reminder, ptrblck! I’ll try it. :smiley:
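
One thing I’ll keep in mind when comparing the two variants: batch norm normalizes with per-batch statistics during training and with its running statistics during evaluation, so the modes need to be switched explicitly (a minimal sketch):

model = CNN_batch()
model.train()   # training: batch norm uses per-batch statistics
# ... training loop ...
model.eval()    # evaluation: batch norm switches to its running statistics
with torch.no_grad():
    preds = model(torch.randn(8, 1, 16, 16)).argmax(dim=1)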