Is this a bug in batch norm?

I went through the code and found this statement at line 78 in PyTorch 1.3.1:

        return F.batch_norm(
            input, self.running_mean, self.running_var, self.weight, self.bias,
            self.training or not self.track_running_stats,
            exponential_average_factor, self.eps)

Maybe this statement should be `self.training and self.track_running_stats`, according to the documentation: "If track_running_stats is set to False, this layer then does not keep running estimates, and batch statistics are instead used during evaluation time as well."

F.batch_norm should only receive training=True if self.training=True and self.track_running_stats=True.
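To make the disagreement concrete, here is a small pure-Python sketch (an illustration, not PyTorch code) that tabulates, for all four mode combinations, when each variant would pass training=True to F.batch_norm:

```python
# Compare the two candidate conditions passed as the `training`
# argument of F.batch_norm. Standalone illustration, not PyTorch code.

def current_condition(training, track_running_stats):
    # What batchnorm.py actually passes
    return training or not track_running_stats

def proposed_condition(training, track_running_stats):
    # The variant suggested in the question
    return training and track_running_stats

for training in (True, False):
    for track in (True, False):
        print(f"training={training!s:5} track={track!s:5} "
              f"or-not={current_condition(training, track)!s:5} "
              f"and={proposed_condition(training, track)!s:5}")
```

The two variants disagree exactly when training and track_running_stats differ, e.g. eval mode with track_running_stats=False, which is the case the thread argues about.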


  • During training, you should always use the statistics computed on the current batch.
  • During evaluation, you should also use the batch statistics if track_running_stats is False: since you were not tracking running stats, there are no running estimates to use at evaluation time.
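A minimal 1-D batch-norm sketch (a hypothetical helper in plain Python, not the real F.batch_norm) shows the selection logic those two points describe:

```python
# Minimal batch-norm sketch (1-D, no affine) showing how the
# `training or not track_running_stats` flag chooses between
# batch statistics and running statistics.

def batch_norm_1d(x, running_mean, running_var,
                  training, track_running_stats, eps=1e-5):
    use_batch_stats = training or not track_running_stats
    if use_batch_stats:
        # training, or eval without tracked stats: compute from the batch
        mean = sum(x) / len(x)
        var = sum((v - mean) ** 2 for v in x) / len(x)
    else:
        # eval with tracked stats: use the pre-computed estimates
        mean, var = running_mean, running_var
    return [(v - mean) / (var + eps) ** 0.5 for v in x]

x = [1.0, 2.0, 3.0, 4.0]
# eval + track_running_stats=False: batch stats are used, so the
# output is normalized to (approximately) zero mean
out = batch_norm_1d(x, running_mean=0.0, running_var=1.0,
                    training=False, track_running_stats=False)
print(sum(out))  # ~0.0
```

With track_running_stats=True instead, the same call in eval mode would normalize with the supplied running estimates rather than the batch statistics.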

So `self.training or not self.track_running_stats` looks correct to me.

  1. There are some cases where, during training, we don’t want to track the statistics.
  2. In evaluation, if track_running_stats=False, this statement makes training=True, so batch_norm will compute statistics from the batch, as I understand it. This seems incorrect, because in evaluation we want to use the pre-computed statistics rather than compute them on the fly.


  1. No, there are not. That’s the definition of the training mode of batch norm (at least the one we use and, to the best of my knowledge, the one from the original paper): use the statistics computed on the current batch.
  2. But if track_running_stats was False, there are no pre-computed statistics to use, as we never tracked them. We have to use the batch statistics, just like during training.
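To see the difference numerically, here is a standalone sketch (a hand-rolled normalizer, not PyTorch) contrasting eval-mode behavior with and without tracked statistics:

```python
# Contrast eval-mode normalization with and without tracked running
# statistics, using a tiny hand-rolled normalizer (not PyTorch).

def normalize(x, mean, var, eps=1e-5):
    return [(v - mean) / (var + eps) ** 0.5 for v in x]

x = [10.0, 12.0]                       # batch with mean 11, variance 1
batch_mean = sum(x) / len(x)
batch_var = sum((v - batch_mean) ** 2 for v in x) / len(x)

# track_running_stats=True: eval uses the pre-computed estimates
# (here, a running mean of 0 and running variance of 1)
with_tracking = normalize(x, mean=0.0, var=1.0)

# track_running_stats=False: no estimates exist, so eval must fall
# back to the batch statistics, just as in training
without_tracking = normalize(x, batch_mean, batch_var)

print(with_tracking)     # large values, far from zero mean
print(without_tracking)  # roughly [-1.0, 1.0]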