Does group norm maintain a running average of mean and variance?


Looking at the code here: https://pytorch.org/docs/stable/_modules/torch/nn/modules/normalization.html

Neither group norm nor layer norm seems to maintain running averages. The description might suggest that they do: https://pytorch.org/docs/stable/nn.html?highlight=group%20norm#torch.nn.GroupNorm

“this layer uses statistics computed from input data in both training and evaluation modes”
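That line from the docs can be illustrated with a minimal NumPy sketch (my own illustration, not the PyTorch implementation): the mean and variance are recomputed from each input, per sample and per group, so there is no stored state to update or reuse at evaluation time.

```python
import numpy as np

def group_norm(x, num_groups, eps=1e-5):
    # x: (N, C, H, W). All statistics come from x itself -- the function
    # keeps no running buffers, so it behaves identically in train and eval.
    N, C, H, W = x.shape
    g = x.reshape(N, num_groups, C // num_groups, H, W)
    mean = g.mean(axis=(2, 3, 4), keepdims=True)   # shape (N, num_groups, 1, 1, 1)
    var = g.var(axis=(2, 3, 4), keepdims=True)
    return ((g - mean) / np.sqrt(var + eps)).reshape(N, C, H, W)

x = np.random.randn(2, 4, 3, 3)
y = group_norm(x, num_groups=2)  # each (sample, group) slice is now ~zero-mean
```

Calling it on a second batch uses that batch's own statistics; nothing from the first call carries over.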

Whether or not they are supposed to, I don’t know. I don’t see running averages in the TensorFlow version of group norm either: https://github.com/tensorflow/tensorflow/blob/r1.13/tensorflow/contrib/layers/python/layers/normalization.py (group_norm)

Or layer norm for that matter:

https://github.com/tensorflow/tensorflow/blob/r1.13/tensorflow/contrib/layers/python/layers/layers.py (layer_norm)

As both compute the mean and std per sample, i.e. in layer norm the mean’s shape is (N, 1), the statistics are indexed by position in the batch, so tracking a running average doesn’t make sense. Who is to say something similar will be at that exact position in your validation batch?
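To make the shape argument concrete, here is a small NumPy sketch of layer norm (again my own illustration, with made-up sizes N=4, D=3): each sample gets its own mean and std, so the statistics tensor has a batch dimension, and averaging it across batches would mix unrelated samples.

```python
import numpy as np

# Hypothetical mini-batch: N=4 samples, D=3 features.
x = np.random.randn(4, 3)

# Layer norm: one mean/std per sample, computed over the feature dim.
mean = x.mean(axis=1, keepdims=True)   # shape (4, 1) -- one entry per batch position
std = x.std(axis=1, keepdims=True)
y = (x - mean) / (std + 1e-5)

# A "running average" of `mean` would average whatever sample happened to
# sit at each batch position -- which is meaningless across batches.
```

Contrast this with batch norm, whose statistics are per channel (shape (C,)) and so can meaningfully be averaged over batches.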


I also found the doc confusing. How can I temporarily freeze the running statistics in order to use other data? Thank you!