Using batch normalization in an RNN

Hello,
I am applying batch normalization (BN) to an RNN.
I want to add BN between every layer in a stacked RNN (e.g., an RNN with 3 layers).
Is it correct to implement it in the module as follows?

1st case:
I want the gamma and beta of BN to be shared across all time steps, but the running mean and variance to be separate for each time step.

# define BN layers in __init__()
self.bn_list = nn.ModuleList()
layer = nn.BatchNorm1d(hidden_size)
bn_weight = layer.weight   # gamma, to be shared across time steps
bn_bias = layer.bias       # beta, to be shared across time steps

for i in range(num_time_steps):
    l = nn.BatchNorm1d(hidden_size)
    # tie gamma/beta to the shared parameters;
    # each layer still keeps its own running mean/variance
    l.weight = bn_weight
    l.bias = bn_bias
    self.bn_list.append(l)

# Then use self.bn_list[t] for time step t in the forward() method
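
For the 1st case, the forward() would look roughly like this (self.rnn_cell, self.hidden_size and self.num_time_steps are just placeholders for whatever cell and sizes I actually use, e.g. an nn.RNNCell, with x shaped (seq_len, batch, input_size)):

def forward(self, x):
    # x: (seq_len, batch, input_size)
    h = x.new_zeros(x.size(1), self.hidden_size)
    outputs = []
    for t in range(self.num_time_steps):
        h = self.rnn_cell(x[t], h)   # placeholder RNN cell
        h = self.bn_list[t](h)       # BN for this time step, shared gamma/beta
        outputs.append(h)
    return torch.stack(outputs)      # (seq_len, batch, hidden_size)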

2nd case:
I want the running mean and variance, as well as the gamma and beta of BN, to be shared across all time steps.

# define a single BN layer in __init__()
self.bn = nn.BatchNorm1d(hidden_size)

# Then apply the same self.bn at every time step in the forward() method
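
For the 2nd case, the forward() loop is the same except that the single layer is reused at every time step (same placeholder names as above):

    for t in range(self.num_time_steps):
        h = self.rnn_cell(x[t], h)   # placeholder RNN cell
        h = self.bn(h)               # the same BN layer at every time step
        outputs.append(h)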

For the 2nd case, I am worried about whether the running mean and variance will be computed correctly, since the same layer is updated at every time step.
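
To make the worry concrete: as far as I understand, a BatchNorm1d layer in training mode updates its running statistics once per call with an exponential moving average (running_mean = (1 - momentum) * running_mean + momentum * batch_mean, with momentum = 0.1 by default), so reusing one layer over T time steps updates the statistics T times per forward pass. A tiny standalone check (toy sizes, not my real model):

import torch
import torch.nn as nn

bn = nn.BatchNorm1d(4)         # hidden_size = 4, just for illustration
bn.train()
h = torch.randn(8, 4)          # (batch, hidden_size)
for t in range(3):             # pretend 3 time steps reuse the same layer
    _ = bn(h)
print(bn.running_mean)         # updated 3 times during this "forward pass"
print(bn.num_batches_tracked)  # tensor(3)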


Did you ever solve this?