Hi everybody,
I’m really confused about Batch Normalization’s behaviour in PyTorch.
From the theory, BN should calculate the mean and variance over all samples in the batch together, for each channel (I put a small sketch of what I expect right after the list below).
OK, so if my input is a matrix (like an image), I have 3 options:
- BatchNorm2d → since my data is 4-dimensional (N, C, H, W)
- BatchNorm1d → by flattening the data (N, C, H, W) → (N, C, L)
- BatchNorm1d → if my data has only one channel (am I right?), I can simply reshape (N, C, L) → (N, L)
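Here is the sketch I mentioned: what I expect BN to compute per channel, written out by hand (just my own understanding; the 1e-5 is the default eps from nn.BatchNorm2d):

import torch

x = torch.randn(10, 1, 2, 2)                              # (N, C, H, W)
mean = x.mean(dim=(0, 2, 3), keepdim=True)                # per-channel mean over N, H, W
var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)  # biased variance, per channel
x_hat = (x - mean) / torch.sqrt(var + 1e-5)               # what I expect BN to output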
So I would expect all three variants to give me the same output. But that’s not what happens: the outputs are different.
I’m really, really confused about this! Especially about the difference between BatchNorm1d with input of shape (N, C, L) versus (N, L). Am I right in saying that the shape (N, L) is for single-channel data?
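For the (N, L) case, here is a quick check I tried (again, just my own sketch): it looks like BatchNorm1d treats each of the L positions as its own channel and normalizes each one over the batch:

import torch
import torch.nn as nn

x = torch.randn(10, 4)                             # (N, L)
bn = nn.BatchNorm1d(4)                             # num_features = 4
out = bn(x)                                        # training mode → batch statistics
manual = (x - x.mean(0)) / torch.sqrt(x.var(0, unbiased=False) + bn.eps)
print(torch.allclose(out, manual, atol=1e-6))      # prints True for me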
Thanks for your help!
Here is my code (really simple, have a look):
import torch
import torch.nn as nn
class BN2D(nn.Module):
    def __init__(self):
        super(BN2D, self).__init__()
        # nn for MNIST-like data
        # input (10, 1, 2, 2)
        self.bn = nn.BatchNorm2d(1)

    def forward(self, x):
        x = self.bn(x)
        # flatten so the output is comparable to the 1d versions
        x = x.view(x.size(0), -1)
        return x
class BN1D(nn.Module):
    def __init__(self):
        super(BN1D, self).__init__()
        # nn for MNIST-like data
        # input (10, 1, 4)
        self.bn = nn.BatchNorm1d(1)
        # input (10, 4)
        self.bn_1 = nn.BatchNorm1d(4)

    def forward(self, x):
        # flatten data (N, C, H, W) → (N, C, L)
        y = x.view(x.size(0), 1, -1)
        y = self.bn(y)
        y = y.view(y.size(0), -1)
        # flatten data (N, C, H, W) → (N, L)
        y_1 = x.view(x.size(0), -1)
        y_1 = self.bn_1(y_1)
        return y, y_1
def main():
    bn1d = BN1D()
    print(bn1d)
    bn2d = BN2D()
    print(bn2d)
    x = torch.randn(10, 1, 2, 2)
    out1d, out1d_1 = bn1d(x)
    out2d = bn2d(x)
    print(out1d)
    print(out1d_1)
    print(out2d)

if __name__ == '__main__':
    main()
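To compare the three outputs numerically instead of just eyeballing the prints, I would add something like this at the end of main() (just a sketch):

    print(torch.allclose(out1d, out2d))      # BatchNorm1d(1) on (N, 1, L) vs BatchNorm2d(1)
    print(torch.allclose(out1d_1, out2d))    # BatchNorm1d(4) on (N, L) vs BatchNorm2d(1)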