A naive question about how BatchNorm2d works. Say I have an input of shape

```
import torch

im = torch.rand(1000, 1, 2, 2)
print(im.shape)
# torch.Size([1000, 1, 2, 2])
```

i.e. a batch of 1000 single-channel 2x2 matrices.

Now say I initialize a BatchNorm2d layer and apply it to my randomly generated tensor

```
import torch.nn as nn

m = nn.BatchNorm2d(1, affine=False, eps=0)
im = m(im)
```

where 1 corresponds to the number of channels (the `num_features` argument of BatchNorm2d). I would expect `torch.mean(im, 0)` to be a 2x2 matrix that is essentially zero, and `torch.var(im, 0)` to be a 2x2 matrix of ones. However, when I run the few lines above, I get

```
In [102]: torch.mean(im,0)
Out[102]:
tensor([[[ 0.0650, -0.0563],
         [ 0.0067, -0.0154]]])
```

and

```
In [103]: torch.var(im,0)
Out[103]:
tensor([[[0.9965, 1.0232],
         [1.0066, 0.9700]]])
```

These results are not what I expected. I'm sure I misunderstand how BatchNorm2d works, but where is my mistake?
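For reference, here is the whole experiment as one runnable snippet, plus one extra check I tried: reducing over the batch and spatial dimensions jointly instead of over the batch dimension alone. The `channel_mean`/`channel_var` names are mine, and I'm assuming the layer is in its default training mode (so it normalizes with the current batch statistics):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so the numbers are reproducible

im = torch.rand(1000, 1, 2, 2)          # (N, C, H, W)
m = nn.BatchNorm2d(1, affine=False, eps=0)
out = m(im)                              # training mode: uses batch statistics

# What I checked above: statistics per spatial position, over the batch only.
per_position_mean = torch.mean(out, 0)   # shape (1, 2, 2), only roughly zero
per_position_var = torch.var(out, 0)     # shape (1, 2, 2), only roughly one

# Extra check: statistics per channel, over batch AND spatial dimensions,
# using the biased variance (unbiased=False).
channel_mean = torch.mean(out, dim=(0, 2, 3))
channel_var = torch.var(out, dim=(0, 2, 3), unbiased=False)
print(channel_mean, channel_var)
```

With this reduction the per-channel statistics do come out as essentially zero mean and unit variance, which may or may not be relevant to my confusion.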