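During training, this layer keeps running estimates of the batch mean and variance. The estimates are updated as an exponential moving average (not a sum) with a default momentum of 0.1: running = (1 - momentum) * running + momentum * batch_statistic.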
During evaluation, the running mean and variance, rather than the statistics of the current batch, are used for normalization.
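
The running-statistics update can be illustrated with a minimal pure-Python sketch for a single feature. This is not the library's implementation. The helper names (update_running_stat, batch_mean_var, normalize_eval) are hypothetical. The initial values (mean 0, variance 1), eps = 1e-5, and the use of the unbiased batch variance for the running estimate are assumptions based on the behaviour described in the referenced BatchNorm1d documentation.

```python
def update_running_stat(running, batch_stat, momentum=0.1):
    """Exponential moving average step for one running statistic."""
    return (1.0 - momentum) * running + momentum * batch_stat


def batch_mean_var(values):
    """Mean and unbiased variance of one feature over a batch (needs >= 2 values)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, var


def normalize_eval(x, running_mean, running_var, eps=1e-5):
    """Evaluation-mode normalization: uses the running estimates only."""
    return (x - running_mean) / (running_var + eps) ** 0.5


# Assumed initial state of the running estimates.
running_mean, running_var = 0.0, 1.0

# One training step on a toy batch of a single feature.
batch = [2.0, 4.0, 6.0]
mean, var = batch_mean_var(batch)  # mean = 4.0, unbiased var = 4.0
running_mean = update_running_stat(running_mean, mean)  # 0.9 * 0.0 + 0.1 * 4.0
running_var = update_running_stat(running_var, var)     # 0.9 * 1.0 + 0.1 * 4.0

# At evaluation time a new input is normalized with the running estimates.
y = normalize_eval(4.0, running_mean, running_var)
```

With a momentum of 0.1, each training batch moves the running estimates only 10% of the way toward that batch's statistics, so they change slowly and smooth out batch-to-batch noise.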
Reference: http://pytorch.org/docs/master/nn.html#torch.nn.BatchNorm1d