Updating batch normalization momentum

Similar to a learning rate schedule, it seems a fair number of networks implemented in TensorFlow use a momentum schedule for batch normalization. Is it possible to do something similar in PyTorch, without losing the running mean/variance?

You can change the batch_norm_obj.momentum attribute, or use the functional form F.batch_norm (http://pytorch.org/docs/master/nn.html#torch.nn.functional.batch_norm).
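
For example, a minimal sketch of both approaches (the momentum value here is only illustrative):

import torch
import torch.nn as nn
import torch.nn.functional as F

bn = nn.BatchNorm1d(64)

# Option 1: change the module's momentum attribute directly
bn.momentum = 0.01

# Option 2: call the functional form with an explicit momentum,
# reusing the module's running statistics and affine parameters
x = torch.randn(8, 64)
out = F.batch_norm(x, bn.running_mean, bn.running_var,
                   weight=bn.weight, bias=bn.bias,
                   training=True, momentum=0.01, eps=bn.eps)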

How does one change the momentum attribute for batch norm included as a layer?

Not sure if I understand your question, but I'd do bn.momentum = 0.01

Sorry, let me clarify.

I have network components defined in blocks like

self.block1 = nn.Sequential(
    nn.Conv1d(3,64,1),
    nn.BatchNorm1d(64),
    nn.ReLU())

How can I modify the momentum of those batch norm layers in the network?

Oh I see. You can use Python indexing to get the layers in an nn.Sequential container, so it would be self.block1[1].momentum = .... If you want to change the momentum for all BN modules in a network, you can iterate through net.modules() and test whether each one is a BN module, as in the sketch below.
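
A rough sketch of that loop (the momentum value and the name net are placeholders for your own network and schedule):

import torch.nn as nn

for m in net.modules():
    # catch all batch norm variants
    if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
        m.momentum = 0.01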

That did the trick. Thanks!