I’m trying to understand the philosophy of PyTorch, and I want to make sure I’m implementing running-mean logic (like the one used in batch normalization) the right way.
I’ve read the source code of the `_BatchNorm` class; unfortunately, the Python code stops at `F.batch_norm()`, and from there the call goes into compiled native code.
Is the right way to implement a running mean in a module simply to register a buffer:

```python
self.register_buffer('running_mean', torch.zeros(feature_dim))
```

and then, in the forward function, update it in place:

```python
self.running_mean += alpha * batch_mean
```
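For concreteness, here is a minimal sketch of what I have in mind (the class name, the `momentum` value, and the exponential-moving-average form, which is how the `BatchNorm` docs describe the running-stat update, are my own choices here):

```python
import torch
import torch.nn as nn

class RunningMean(nn.Module):
    """Minimal sketch: track a per-feature running mean as a buffer."""

    def __init__(self, feature_dim, momentum=0.1):
        super().__init__()
        self.momentum = momentum
        # A buffer is saved in state_dict and moved by .to()/.cuda(),
        # but is not returned by parameters(), so no optimizer touches it.
        self.register_buffer('running_mean', torch.zeros(feature_dim))

    def forward(self, x):
        # x: (batch, feature_dim)
        if self.training:
            batch_mean = x.mean(dim=0)
            # Update the statistic outside of autograd; buffers do not
            # require grad, but no_grad() makes the intent explicit.
            with torch.no_grad():
                self.running_mean.mul_(1 - self.momentum)
                self.running_mean.add_(self.momentum * batch_mean)
        return x - self.running_mean

# Quick check
m = RunningMean(4)
_ = m(torch.randn(8, 4))
print(m.running_mean)  # moved away from zero after one batch
```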
I’m cautious about this because I’m an old Theano user: there, the update of a shared variable must be explicitly defined and passed to the `theano.function()` interface, and a direct `+=` is not allowed.
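For contrast, this is roughly how I would have written the same update in Theano, with the update declared up front (the variable names and the `alpha` value are just placeholders):

```python
import numpy as np
import theano
import theano.tensor as T

feature_dim, alpha = 4, 0.1
x = T.matrix('x')
# Shared variables hold state across calls, like PyTorch buffers.
running_mean = theano.shared(np.zeros(feature_dim))
batch_mean = x.mean(axis=0)
# The update rule is passed explicitly to theano.function();
# there is no in-place `+=` on the shared variable itself.
update_stats = theano.function(
    [x], batch_mean,
    updates=[(running_mean,
              (1 - alpha) * running_mean + alpha * batch_mean)])
```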