[resolved] presumed [BUG] in BatchNorm

mderakhshani · June 5, 2017, 5:37pm

Hi,
I ran an experiment to see how batchnorm works in PyTorch. I wrote the following code:

import torch
import torch.nn as nn

bn = nn.BatchNorm2d(4)
bn.weight.data = torch.arange(0,4)
bn.bias.data = torch.arange(4,8)
bn.running_mean.data = torch.arange(8,12)
bn.running_var.data = torch.arange(12,16)
inp = torch.autograd.Variable(torch.randn(1,4,4,4))
out = bn(inp)

I set a breakpoint at functional.py#L501-L504 and printed running_mean and running_var, as shown below:

def batch_norm(input, running_mean, running_var, weight=None, bias=None,
               training=False, momentum=0.1, eps=1e-5):
    import pdb
    pdb.set_trace()
    f = torch._C._functions.BatchNorm(running_mean, running_var, training, momentum, eps, torch.backends.cudnn.enabled)
    return f(input, weight, bias)

I saw something strange: their values were always 0 and 1 respectively, regardless of the initialization I did above:

(Pdb) running_mean

 0
 0
 0
 0
[torch.FloatTensor of size 4]
(Pdb) running_var

 1
 1
 1
 1
[torch.FloatTensor of size 4]

Could you please check it?

cjolivier01 · June 5, 2017, 10:28pm

If training is False, then it is most likely not updating the running values, since it assumes it is doing inference only, in which case the running mean and var are conceptually read-only.
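
As a rough illustration of that distinction, here is a minimal pure-Python sketch of the kind of exponential-moving-average update batch norm applies to its running statistics. The helper name update_running_stat and the sample numbers are made up for this example, and the momentum default of 0.1 simply mirrors the batch_norm signature quoted above; it is a sketch under those assumptions, not PyTorch's actual implementation.

```python
def update_running_stat(running, batch_stat, training, momentum=0.1):
    """Return the new running statistic for a single channel.

    Hypothetical helper for illustration only, not a PyTorch function.
    """
    if not training:
        # Inference: the running value is only read, never modified.
        return running
    # Training: blend the current batch statistic into the running value.
    return (1.0 - momentum) * running + momentum * batch_stat


running_mean = 8.0  # the value the original post tried to set for channel 0
batch_mean = 0.5    # made-up batch mean, chosen only for this example

eval_result = update_running_stat(running_mean, batch_mean, training=False)
train_result = update_running_stat(running_mean, batch_mean, training=True)
print(eval_result)
print(train_result)
```

With training=False the running value comes back unchanged, while with training=True it moves a small step toward the batch statistic.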

ruotianluo · June 6, 2017, 12:03am

What's your build version? My 0.1.12 works correctly.

chenyuntc · June 6, 2017, 2:30am

bn.running_mean and bn.running_var are tensors

It is the same problem as in this post:

mderakhshani · June 6, 2017, 6:34am

Did you get the values above inside the batch_norm function? I installed PyTorch using Anaconda, and the version is:
pytorch 0.1.12 py35_2cu80 [cuda80] soumith

mderakhshani · June 6, 2017, 6:36am

What do you mean? Could you explain more?

mderakhshani · June 6, 2017, 6:56am

You choose between training mode and testing mode depending on whether model.eval() is called. Here, the training flag is True, so it should be possible to update these variables.

chenyuntc · June 6, 2017, 6:59am

running_mean is a tensor rather than a Variable, so setting a data attribute on it makes no sense. Assign the tensor itself instead:

import torch
import torch.nn as nn

bn = nn.BatchNorm2d(4)
bn.weight.data  = torch.arange(0,4)
bn.bias.data  = torch.arange(4,8)
bn.running_mean = torch.arange(8,12)
bn.running_var = torch.arange(12,16)
inp = torch.autograd.Variable(torch.randn(1,4,4,4))
out = bn(inp)
print(bn.running_mean)
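
To make the point about .data concrete, here is a toy sketch in plain Python. ToyTensor and ToyModule are hypothetical stand-ins invented for this example, not PyTorch classes, and the sketch assumes that the plain tensor gives no special meaning to a data attribute, so that assigning to it only attaches an extra attribute to the old object.

```python
class ToyTensor:
    """Hypothetical stand-in for a plain tensor (not a PyTorch class)."""

    def __init__(self, values):
        self.values = list(values)


class ToyModule:
    """Hypothetical stand-in for a module that owns a running_mean buffer."""

    def __init__(self):
        self.running_mean = ToyTensor([0.0, 0.0, 0.0, 0.0])


m = ToyModule()
original = m.running_mean

# Setting .data only attaches a new, unused attribute to the old object;
# the values that the module actually reads stay the same.
m.running_mean.data = ToyTensor([8.0, 9.0, 10.0, 11.0])
after_data_assignment = m.running_mean.values

# Rebinding the attribute on the module replaces the buffer itself.
m.running_mean = ToyTensor([8.0, 9.0, 10.0, 11.0])
after_rebinding = m.running_mean.values

print(after_data_assignment)
print(after_rebinding)
```

Only the second assignment changes what the module sees, which matches the difference between the two snippets in this thread.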

mderakhshani · June 6, 2017, 7:02am

That causes an error:
TypeError: cannot assign 'torch.FloatTensor' as parameter 'weight' (torch.nn.Parameter or None expected)

chenyuntc · June 6, 2017, 7:03am

Edited, see above.

mderakhshani · June 6, 2017, 7:05am

Solved! Thank you so much