TF LayerNormalization vs PyTorch LayerNorm

In TensorFlow’s implementation of LayerNormalization here, the layer can be initialized inside a module’s __init__ because it doesn’t need the normalized shape up front. I might be misunderstanding this, but PyTorch’s LayerNorm requires the shape of the input (output) to be normalized at construction time. Since I deal with a different sequence length in each batch, which changes the shape of the output, does that mean I can’t initialize it inside my own module?
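For concreteness, here’s roughly the situation I mean; `MyEncoder` and `hidden_dim` are just placeholder names, and I’m assuming inputs shaped (batch, seq_len, hidden_dim) where seq_len changes from batch to batch:

```python
import tensorflow as tf
import torch
import torch.nn as nn

# TF side: no shape argument needed at construction time; the layer infers it
# from the first input it sees (normalizing over the last axis by default).
tf_norm = tf.keras.layers.LayerNormalization()

# PyTorch side: nn.LayerNorm wants normalized_shape when it is constructed.
class MyEncoder(nn.Module):            # placeholder module name
    def __init__(self, hidden_dim):    # placeholder feature size
        super().__init__()
        # This is the part I'm unsure about: what shape do I pass here
        # when the input shape varies per batch?
        self.norm = nn.LayerNorm(hidden_dim)

    def forward(self, x):
        # x: (batch, seq_len, hidden_dim)
        return self.norm(x)
```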
