[resolved] presumed [BUG] in BatchNorm

Hi,
I ran an experiment to see how batchnorm works in PyTorch. I wrote the following code:

import torch
import torch.nn as nn

bn = nn.BatchNorm2d(4)
bn.weight.data = torch.arange(0,4)
bn.bias.data = torch.arange(4,8)
bn.running_mean.data = torch.arange(8,12)
bn.running_var.data = torch.arange(12,16)
inp = torch.autograd.Variable(torch.randn(1,4,4,4))
out = bn(inp)

I set a breakpoint at functional.py#L501-L504 and printed running_mean and running_var, as shown below:

def batch_norm(input, running_mean, running_var, weight=None, bias=None,
               training=False, momentum=0.1, eps=1e-5):
    import pdb
    pdb.set_trace()
    f = torch._C._functions.BatchNorm(running_mean, running_var, training, momentum, eps, torch.backends.cudnn.enabled)
    return f(input, weight, bias)

I saw strange behavior: their values were always 0 and 1 respectively, regardless of the initialization I did above:

(Pdb) running_mean

 0
 0
 0
 0
[torch.FloatTensor of size 4]
(Pdb) running_var

 1
 1
 1
 1
[torch.FloatTensor of size 4]

Could you please check it?

If training is False, then it is most likely not updating the running values, since it assumes it is doing inference only, in which case the running mean and variance are conceptually read-only.
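
To make the distinction concrete, here is a minimal sketch in plain Python of how a running statistic is typically updated with momentum. It only illustrates the usual moving-average convention and is not PyTorch's actual implementation; `update_running_stat` is a hypothetical helper name.

```python
def update_running_stat(running, batch_stat, training, momentum=0.1):
    """Return the new running statistic for one channel.

    Hypothetical helper for illustration only; not part of PyTorch.
    """
    if not training:
        # Inference: the running value is only read, never modified.
        return running
    # Training: move the running value toward the current batch statistic.
    return (1 - momentum) * running + momentum * batch_stat


running_mean = 8.0  # value the module currently holds
batch_mean = 0.0    # mean of the current batch for this channel

after_eval = update_running_stat(running_mean, batch_mean, training=False)
after_train = update_running_stat(running_mean, batch_mean, training=True)
print(after_eval, after_train)
```

With training=False the stored value comes back unchanged; with training=True it moves a fraction `momentum` of the way toward the batch statistic.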

What’s your build version? My 0.1.12 works correctly.

bn.running_mean and bn.running_var are tensors.

This is the same problem as in this post:

Did you get the values above inside the batch_norm function? I installed PyTorch with Anaconda, and the version is:
pytorch 0.1.12 py35_2cu80 [cuda80] soumith

What do you mean? Could you explain more?

Calling model.eval() is what switches a module from training mode to testing mode. It is not called here, so the training flag is True and these variables can be updated.

running_mean is a tensor rather than a Variable, so setting a data attribute on it makes no sense: it does not change the buffer, which is why you still see 0 and 1. Assign the tensor itself instead:

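import torch
import torch.nn as nn

bn = nn.BatchNorm2d(4)
# weight and bias are Parameters (Variables), so set them through .data
bn.weight.data = torch.arange(0,4)
bn.bias.data = torch.arange(4,8)
# running_mean and running_var are plain tensors, so assign them directly
bn.running_mean = torch.arange(8,12)
bn.running_var = torch.arange(12,16)
inp = torch.autograd.Variable(torch.randn(1,4,4,4))
out = bn(inp)
print(bn.running_mean)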
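
On recent PyTorch releases, where Variable has been merged into Tensor, one way to get the same effect is to overwrite the buffers in place. This is a sketch assuming a recent PyTorch version, not the 0.1.12 API used in this thread. It also checks that the running mean is left alone in eval mode and updated in train mode:

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm2d(4)

# Overwrite the running statistics in place; no_grad keeps autograd out of it.
with torch.no_grad():
    bn.running_mean.copy_(torch.arange(8.0, 12.0))
    bn.running_var.copy_(torch.arange(12.0, 16.0))

inp = torch.randn(1, 4, 4, 4)

# eval mode: the running statistics are read but not updated.
bn.eval()
bn(inp)
mean_after_eval = bn.running_mean.clone()

# train mode: the running statistics move toward the batch statistics.
bn.train()
bn(inp)
mean_after_train = bn.running_mean.clone()

print(mean_after_eval)
print(mean_after_train)
```

The first printed tensor should still hold the values 8 to 11, while the second depends on the random input.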

That causes an error:
TypeError: cannot assign 'torch.FloatTensor' as parameter 'weight' (torch.nn.Parameter or None expected)

Edited, see above.

Solved! Thank you so much