Check gamma and beta of BN

Hi,

BN is calculated according to the following formula:
https://pytorch.org/docs/master/generated/torch.nn.BatchNorm2d.html#torch.nn.BatchNorm2d

I have 2 questions about BN:
How can I print the gamma and beta of BN in code?
How are running_var and running_mean calculated in code? Is there a calculation formula?

gamma and beta are assigned as weight and bias, respectively, so you could use:

print(bn.weight)
print(bn.bias)
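For example, a minimal self-contained sketch (assuming a fresh BatchNorm2d layer, where gamma defaults to ones and beta to zeros):

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm2d(3)

# gamma is stored as bn.weight (initialized to ones),
# beta is stored as bn.bias (initialized to zeros)
print(bn.weight)  # gamma, shape (3,)
print(bn.bias)    # beta, shape (3,)
```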

Yes, the NOTE section in the linked docs shows the update formula.
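As a sketch of that update rule (assuming the default momentum=0.1, and noting that the running variance is updated with the unbiased batch variance): new_stat = (1 - momentum) * old_stat + momentum * batch_stat.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm2d(3, momentum=0.1)
bn.train()

x = torch.randn(8, 3, 4, 4)

# batch statistics over the (N, H, W) dimensions
batch_mean = x.mean(dim=(0, 2, 3))
# the running_var update uses the *unbiased* batch variance
batch_var = x.var(dim=(0, 2, 3), unbiased=True)

# expected update: new = (1 - momentum) * old + momentum * batch_stat
# (running_mean starts at zeros, running_var at ones)
expected_mean = 0.9 * torch.zeros(3) + 0.1 * batch_mean
expected_var = 0.9 * torch.ones(3) + 0.1 * batch_var

bn(x)
print(torch.allclose(bn.running_mean, expected_mean, atol=1e-6))  # True
print(torch.allclose(bn.running_var, expected_var, atol=1e-6))    # True
```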

@ptrblck ,
Thank you!

How are running_mean and running_var used in code?
I mean, in which cases are they used?

I am experiencing the following issue:
There are 2 BN layers in 2 models, respectively. The input, weight, and bias are all the same; the only difference is running_mean and running_var. In the model.train() case, the outputs of those 2 BN layers are different. Why?

In the default setup the running stats are updated during training (i.e. if the model is in train() mode) using the mentioned update formula. The input batches are normalized using the batch statistics.
During validation (i.e. if the model is in eval() mode) the running stats will be used to normalize the input.
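A small sketch of this train/eval difference (assuming a freshly initialized layer and default settings):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm2d(3)
x = torch.randn(8, 3, 4, 4)

# train() mode: the input is normalized with the *batch* statistics,
# so the per-channel output mean is close to zero
bn.train()
out_train = bn(x)
print(out_train.mean(dim=(0, 2, 3)))  # approximately zero per channel

# eval() mode: the *running* stats are used instead, so the same
# input generally produces a different output
bn.eval()
out_eval = bn(x)
print(torch.allclose(out_train, out_eval))  # False
```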

If both layers are getting the same input and are both in train() mode, their outputs would also be equal. I would guess that the inputs might differ, and you should compare them.
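To illustrate: in the sketch below (assuming identical weight and bias, which is the default initialization), two BN layers with deliberately different running stats still produce identical outputs in train() mode, because the batch statistics are used there:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn1 = nn.BatchNorm2d(3).train()
bn2 = nn.BatchNorm2d(3).train()

# give the second layer different running stats
with torch.no_grad():
    bn2.running_mean.fill_(5.0)
    bn2.running_var.fill_(10.0)

x = torch.randn(8, 3, 4, 4)
out1, out2 = bn1(x), bn2(x)

# train() mode normalizes with the batch stats, so the running
# stats don't affect the output and the results match
print(torch.allclose(out1, out2))  # True
```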

@ptrblck ,
Thank you!

Please check the following comparison. Both are in train() mode; input, weight, and bias are all identical. running_mean and running_var are different, and the output of BN is not identical.

I have the same understanding as yours, that is why I asked you those above questions. :slight_smile:

Could you post a minimal executable code snippet to reproduce this issue? The output difference is large enough that it doesn’t seem to come from the limited precision errors.

@ptrblck ,
Thank you!

I did the experiment like this:

  1. registered a hook function via register_forward_hook,
  2. if the module is nn.BatchNorm2d, save the data and stop saving after the BN.
    The experiment is based on resnet18, and the upper picture is for the first BN of resnet18.

I tried another experiment:
during model loading, I copied the right model's running_mean and running_var to the left model; the output of the BN of the left model then became identical to the right model's.
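For reference, the hook-based capture described above could be sketched like this (a hypothetical minimal version: a tiny Conv+BN stand-in model is used here instead of resnet18, but the idea is identical):

```python
import torch
import torch.nn as nn

# tiny stand-in model (the thread uses resnet18; the hook logic is the same)
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
)

saved = {}

def make_hook(name):
    def hook(module, inputs, output):
        # save the BN input and output tensors for later comparison
        saved[name] = (inputs[0].detach().clone(), output.detach().clone())
    return hook

# register a forward hook on every BatchNorm2d module
for name, m in model.named_modules():
    if isinstance(m, nn.BatchNorm2d):
        m.register_forward_hook(make_hook(name))

model.train()
model(torch.randn(2, 3, 8, 8))
print(sorted(saved.keys()))  # ['1'] -> the BatchNorm2d module
```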