Hello,
I want to apply batch norm to variable-length sequences that are zero-padded to a common length, and I am curious how to do it properly so that the padded zeros are not taken into account when calculating the mean/std. Or is the only way to solve this to implement the batch norm layer myself?
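To make the question concrete, here is a minimal plain-Python sketch of the behaviour I am after. It assumes the padding sits at the end of each sequence and that the true lengths are known. The names `masked_batch_stats` and `masked_batch_norm` are hypothetical helpers written for illustration, not functions from any library, and the sketch leaves out the learnable scale/shift and the running statistics that a real batch norm layer would have.

```python
import math


def masked_batch_stats(batch, lengths):
    """Per-feature mean and biased variance over valid time steps only.

    batch:   list of sequences; each sequence is a list of time steps,
             each time step a list of features. All sequences are
             zero-padded at the end to the same length (assumption).
    lengths: true (unpadded) length of each sequence.
    """
    num_features = len(batch[0][0])
    count = sum(lengths)  # number of valid time steps in the whole batch

    # Mean over valid steps: padded steps are skipped by slicing seq[:n].
    mean = [0.0] * num_features
    for seq, n in zip(batch, lengths):
        for step in seq[:n]:
            for f, x in enumerate(step):
                mean[f] += x / count

    # Biased variance over the same valid steps.
    var = [0.0] * num_features
    for seq, n in zip(batch, lengths):
        for step in seq[:n]:
            for f, x in enumerate(step):
                var[f] += (x - mean[f]) ** 2 / count

    return mean, var


def masked_batch_norm(batch, lengths, eps=1e-5):
    """Normalize valid steps with the masked statistics; keep padding at zero."""
    mean, var = masked_batch_stats(batch, lengths)
    out = []
    for seq, n in zip(batch, lengths):
        normed = []
        for t, step in enumerate(seq):
            if t < n:
                normed.append(
                    [(x - m) / math.sqrt(v + eps) for x, m, v in zip(step, mean, var)]
                )
            else:
                # Padded position: leave it as zeros instead of normalizing it.
                normed.append([0.0] * len(step))
        out.append(normed)
    return out


# Two sequences with one feature each, padded to length 3.
batch = [[[1.0], [3.0], [0.0]], [[5.0], [0.0], [0.0]]]
lengths = [2, 1]
mean, var = masked_batch_stats(batch, lengths)
normed = masked_batch_norm(batch, lengths)
```

In this example only the values 1, 3 and 5 should contribute to the statistics, whereas a plain batch norm over the padded batch would also average in the three padding zeros.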
Thank you in advance!