You cannot use batchnorm layers in training mode with a single sample if the temporal dimension also contains only a single time step, as seen here:
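import torch
import torch.nn as nn

bn = nn.BatchNorm1d(3)
x = torch.randn(1, 3, 10)  # 1 sample, 3 channels, 10 time steps
out = bn(x)  # works: 10 values per channel
x = torch.randn(1, 3, 1)  # 1 sample, 3 channels, 1 time step
out = bn(x)  # ValueError: Expected more than 1 value per channel when training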
since the mean would just be the channel value itself and the stddev cannot be estimated from a single value per channel.
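To make this a bit more concrete, here is a small sketch that computes the per-channel statistics by hand for the single-value case and then checks how the layer behaves in training and eval mode. The variable names (`mean`, `var_biased`, `var_unbiased`, `training_failed`, `out_eval`) are just picked for illustration, and the eval-mode part assumes the default initialization of the running stats (mean 0, variance 1):

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm1d(3)
x = torch.randn(1, 3, 1)  # one sample, one time step -> one value per channel

# Per-channel statistics over the batch and time dimensions
mean = x.mean(dim=(0, 2))                        # equals the single value itself
var_biased = x.var(dim=(0, 2), unbiased=False)   # zero for a single value
var_unbiased = x.var(dim=(0, 2), unbiased=True)  # nan (zero degrees of freedom)

# In training mode the layer refuses this input
training_failed = False
try:
    bn(x)
except ValueError:
    training_failed = True

# In eval mode the running stats are used instead of batch stats,
# so the same input passes through without an error
bn.eval()
out_eval = bn(x)
```

Note that the unbiased variance call will likely emit a warning about the degrees of freedom, which is expected here. The eval-mode call only works around the error; with freshly initialized running stats it does not normalize the input in any meaningful way.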
I’m not sure whether any normalization layer would make sense in such a use case, but let’s wait for others to chime in.