As far as I understand the documentation for the BatchNorm1d layer, we pass the number of features as the argument to the constructor: nn.BatchNorm1d(num_features).
As input the layer takes a tensor of shape (N, C, L), where N is the batch size (I guess…), C is the number of features (statistics are computed separately for each of these C features), and L is the sequence length.
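A minimal sketch of that reading of the docs, assuming the default settings (training mode, affine weight initialized to 1 and bias to 0). The sizes and the names `manual` and `matches` are only for illustration. It compares the layer output with a hand-written normalization that uses, for each of the C features, the mean and biased variance taken over the N and L dimensions:

```python
import torch
from torch import nn

torch.manual_seed(0)

# illustrative sizes: batch N, features C, sequence length L
N, C, L = 3, 5, 4
x = torch.rand(N, C, L)

bn = nn.BatchNorm1d(C)  # a fresh module is in training mode, so it uses batch statistics
y = bn(x)

# per-feature statistics, reduced over the batch (dim 0) and length (dim 2) dimensions
mean = x.mean(dim=(0, 2), keepdim=True)
var = x.var(dim=(0, 2), unbiased=False, keepdim=True)
manual = (x - mean) / torch.sqrt(var + bn.eps)

matches = torch.allclose(y, manual, atol=1e-5)
print(y.shape, matches)
```

If the two agree, the layer expects the feature dimension at index 1, which is why the (batch, time, features) layout does not fit directly.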
Let’s assume I have input in the following shape:
(batch_size, number_of_timesteps, number_of_features)
which is the usual data shape for time series when batch_first=True.
Should I transpose the input (swap dimensions 1 and 2) before running the batch normalization?
In that case I would have to transpose the output back again to use it in an RNN afterwards, which looks quite weird to me.
Can someone please take a look at the example below and let me know if this is the proper way to do it?
import torch
from torch import nn

# data (batch size, number of time steps, number of features)
x = torch.rand(3, 4, 5)

# layers
bn = nn.BatchNorm1d(5)
rnn = nn.RNN(5, 10, 1, batch_first=True)

# computation - transpose TWICE
x_normalized = bn(x.transpose(1, 2)).transpose(1, 2)
rnn(x_normalized)
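One way to sanity-check the double transpose is to look at the shapes and the per-feature mean afterwards. This is only a sketch under the same default settings (training mode, untouched affine parameters). The names `per_feature_mean`, `output` and `h_n` are introduced here for illustration:

```python
import torch
from torch import nn

torch.manual_seed(0)

# data (batch size, number of time steps, number of features)
x = torch.rand(3, 4, 5)

bn = nn.BatchNorm1d(5)
rnn = nn.RNN(5, 10, 1, batch_first=True)

# (N, T, F) -> (N, F, T) for BatchNorm1d, then back to (N, T, F) for the RNN
x_normalized = bn(x.transpose(1, 2)).transpose(1, 2)

# each feature should now have roughly zero mean over batch and time
per_feature_mean = x_normalized.mean(dim=(0, 1))

output, h_n = rnn(x_normalized)
print(x_normalized.shape, output.shape, h_n.shape)
print(per_feature_mean)
```

The normalized tensor keeps the original (batch, time, features) layout, so the RNN accepts it unchanged. This checks only shapes and statistics, not whether the pattern is the recommended one.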