How to implement running-mean logic in PyTorch?

I’m trying to understand the philosophy of PyTorch, and I want to make sure of the right way to implement running-mean logic (like the one in batch normalization) in PyTorch.

I’ve read the source code of the _BatchNorm class, but unfortunately the Python code stops at F.batch_norm(); from there the call goes into the compiled backend.

Is it right to implement the running mean in a module simply as:
self.register_buffer('running_mean', torch.Tensor(feature_dim))
then in the forward function:
self.running_mean += alpha * batch_mean

I’m cautious about this because I’m an old Theano user: in Theano, the update of a variable must be explicitly defined and passed to the theano.function() interface, and a direct += is not allowed there.

Yes, you are on the right track.

You register a buffer with self.register_buffer for your running mean, and then you take the mean of your input batch via batch_mean = input.data.mean(...).

Please note the input.data part, which gives you direct access to the Tensor backing the input Variable, so the update itself is not tracked by autograd.
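For reference, here is a minimal sketch of how this could look when put together. The names feature_dim and momentum, the zero initialization, and the use of .detach() (the modern spelling of the .data access mentioned above) are my own illustrative choices, not something mandated by the thread:

```python
import torch
import torch.nn as nn

class RunningMean(nn.Module):
    """Minimal sketch: track a running mean of the input features."""
    def __init__(self, feature_dim, momentum=0.1):
        super().__init__()
        self.momentum = momentum
        # register_buffer makes running_mean part of the module's state
        # (saved in state_dict, moved by .to()/.cuda()) without making it
        # a learnable Parameter.
        self.register_buffer('running_mean', torch.zeros(feature_dim))

    def forward(self, input):
        if self.training:
            # Mean over the batch dimension; .detach() keeps the update
            # out of the autograd graph, like the input.data access above.
            batch_mean = input.detach().mean(dim=0)
            # In-place exponential-moving-average style update.
            self.running_mean.mul_(1 - self.momentum).add_(self.momentum * batch_mean)
        return input
```

Since running_mean is a buffer rather than a Parameter, it is saved and loaded with the module's state_dict and moved to the GPU together with the module, but the optimizer never touches it.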
