Question on batch normalization

I have three convolution layers:

self.conv3 = nn.Conv1d(self.input_encoding_size, 100, kernel_size=3)
self.conv4 = nn.Conv1d(self.input_encoding_size, 100, kernel_size=4)
self.conv5 = nn.Conv1d(self.input_encoding_size, 100, kernel_size=5)

I want to apply batch normalization after each conv layer. Do I need three separate batch normalization layers in this case, or just a single one?

self.bn3 = nn.BatchNorm1d(100)
self.bn4 = nn.BatchNorm1d(100)
self.bn5 = nn.BatchNorm1d(100)
x3 = F.relu(self.bn3(self.conv3(seq_vec)))
x4 = F.relu(self.bn4(self.conv4(seq_vec)))
x5 = F.relu(self.bn5(self.conv5(seq_vec)))

Is this the same as?

self.bn3 = nn.BatchNorm1d(100)
x3 = F.relu(self.bn3(self.conv3(seq_vec)))
x4 = F.relu(self.bn3(self.conv4(seq_vec)))
x5 = F.relu(self.bn3(self.conv5(seq_vec)))

You need three separate BatchNorm layers, because BatchNorm also tracks running_mean and running_var, the batch statistics used at test time. These statistics will be quite different for x3, x4, and x5, so sharing one layer would mix them together.
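Here is a minimal runnable sketch of the setup above with one BatchNorm1d per conv branch. The module name, input_encoding_size value, and input shapes are assumptions for illustration; after one training-mode forward pass you can inspect the running statistics of each branch and see that they differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiKernelConv(nn.Module):
    """Hypothetical module: three conv branches, each with its own BatchNorm."""
    def __init__(self, input_encoding_size=32):
        super().__init__()
        self.conv3 = nn.Conv1d(input_encoding_size, 100, kernel_size=3)
        self.conv4 = nn.Conv1d(input_encoding_size, 100, kernel_size=4)
        self.conv5 = nn.Conv1d(input_encoding_size, 100, kernel_size=5)
        # One BatchNorm1d per branch, so each keeps its own running stats.
        self.bn3 = nn.BatchNorm1d(100)
        self.bn4 = nn.BatchNorm1d(100)
        self.bn5 = nn.BatchNorm1d(100)

    def forward(self, seq_vec):
        x3 = F.relu(self.bn3(self.conv3(seq_vec)))
        x4 = F.relu(self.bn4(self.conv4(seq_vec)))
        x5 = F.relu(self.bn5(self.conv5(seq_vec)))
        return x3, x4, x5

model = MultiKernelConv()
seq_vec = torch.randn(8, 32, 20)  # assumed (batch, channels, length) input
model.train()
model(seq_vec)  # updates each bn's running_mean/running_var separately
# The branches see differently distributed activations, so the stats diverge:
print(torch.allclose(model.bn3.running_mean, model.bn4.running_mean))
```

With a shared `self.bn3` across all three branches, those per-branch statistics would instead be averaged into one set, giving a mismatched normalization at eval time.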
