Hi,
I’m using an LSTM model and would like to run a batchnorm layer on sequences (of different lengths) before passing them to the LSTM.
Batchnorm doesn’t handle padded sequences: if I run batchnorm before the LSTM, the padded 0s are included in the batch statistics and are themselves shifted to non-zero values, which interferes with the padding.
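Here is a minimal sketch of what I mean, assuming PyTorch and `nn.BatchNorm1d` (the shapes, lengths, and values are made up just for illustration):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical batch: 2 sequences, 3 features each, true lengths 4 and 2,
# zero-padded to a common length of 4.
lengths = torch.tensor([4, 2])
batch = torch.randn(2, 4, 3) + 5.0                   # shift so the mean is clearly non-zero
mask = torch.arange(4)[None, :] < lengths[:, None]   # (2, 4), True at real timesteps
batch = batch * mask[..., None]                      # zero out the padded timesteps

bn = nn.BatchNorm1d(3)  # freshly created module, so it is in training mode

# BatchNorm1d expects (batch, features, time), so move features to dim 1 and back.
out = bn(batch.transpose(1, 2)).transpose(1, 2)

padded_before = batch[~mask]  # values at the padded positions before batchnorm
padded_after = out[~mask]     # values at the same positions after batchnorm

print(padded_before)  # all zeros
print(padded_after)   # no longer zeros
```

The padded positions start as zeros but come out of batchnorm as non-zero values, and the zeros also take part in the mean and variance that batchnorm computes.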
What would be the best approach?
Thanks!