How to apply nn.BatchNorm1d to every layer of an nn.GRU module?

I’m learning GRUs and deep learning, and I’ve found the benefits of BN (batch normalization) advocated everywhere on the Internet.
I want to try it myself, but I don’t know how to apply nn.BatchNorm1d to my GRU module, because the GRU module (nn.GRU) doesn’t give me a chance to normalize the inputs of each layer.
It looks like we can only normalize (or modify) the input of the first layer, even though a GRU can have multiple hidden layers.
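Here is a rough sketch of the only workaround I can think of: replacing one multi-layer nn.GRU with a stack of single-layer nn.GRU modules and putting an nn.BatchNorm1d between them. The class name `BNGRU` and the transpose step are just my own guesses, and I don’t know if this is the right approach:

```python
import torch
import torch.nn as nn

class BNGRU(nn.Module):
    """Stack of single-layer GRUs with a BatchNorm1d between layers
    (my own sketch, not an existing PyTorch module)."""
    def __init__(self, input_size, hidden_size, num_layers):
        super().__init__()
        self.grus = nn.ModuleList()
        self.bns = nn.ModuleList()
        for i in range(num_layers):
            in_size = input_size if i == 0 else hidden_size
            self.grus.append(nn.GRU(in_size, hidden_size, batch_first=True))
            # BatchNorm1d normalizes over the channel dimension, so the
            # (batch, seq, features) GRU output has to be transposed to
            # (batch, features, seq) before it is fed in.
            self.bns.append(nn.BatchNorm1d(hidden_size))

    def forward(self, x):
        for gru, bn in zip(self.grus, self.bns):
            x, _ = gru(x)
            x = bn(x.transpose(1, 2)).transpose(1, 2)
        return x

model = BNGRU(input_size=10, hidden_size=20, num_layers=3)
out = model(torch.randn(8, 50, 10))  # batch of 8 sequences, 50 steps, 10 features
```

But this throws away the optimized multi-layer implementation of nn.GRU, so I’m hoping there is a better way.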

If I understand correctly, BN should be applied to every layer of an NN.

Thanks!