I think doing

import torch
import torch.nn as nn

x = torch.randn(1, 3, 6)   # batch size 1, 3 channels, sequence length 6
a = nn.Conv1d(3, 6, 3)     # in_channels 3, out_channels 6, kernel size 3
gn = nn.GroupNorm(1, 6)    # a single group spanning all 6 channels
gn(a(x))
tensor([[[-0.1459, 0.5860, 0.1771, 1.1413],
[-0.8613, 2.7552, -1.0135, 0.8898],
[-0.1119, -0.1656, -0.4536, -0.9865],
[ 0.6755, -1.3193, 1.2248, -0.5849],
[ 1.2789, -0.5229, 0.1345, 0.1763],
[-2.1555, 0.0149, -0.2769, -0.4565]]], grad_fn=<...>)
is equivalent to
ln = nn.LayerNorm([6, 4])  # normalized_shape = [out_channels, L_out], since Conv1d gives L_out = 6 - 3 + 1 = 4
ln(a(x))
tensor([[[-0.1459, 0.5860, 0.1771, 1.1413],
[-0.8613, 2.7552, -1.0135, 0.8898],
[-0.1119, -0.1656, -0.4536, -0.9865],
[ 0.6755, -1.3193, 1.2248, -0.5849],
[ 1.2789, -0.5229, 0.1345, 0.1763],
[-2.1555, 0.0149, -0.2769, -0.4565]]],
grad_fn=<...>)
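To check this equivalence without eyeballing printed tensors, here is a minimal sketch that compares the two outputs numerically (seed and shapes are my own choices; both modules are left at their default affine initialization, weight = 1 and bias = 0, which is what makes them agree):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(1, 3, 6)          # (batch, channels, length)
conv = nn.Conv1d(3, 6, 3)
y = conv(x)                       # shape (1, 6, 4): L_out = 6 - 3 + 1 = 4

# GroupNorm with one group normalizes over all channels and positions per sample;
# LayerNorm over [C, L_out] normalizes over exactly the same elements.
gn = nn.GroupNorm(1, 6)
ln = nn.LayerNorm([6, 4])

print(torch.allclose(gn(y), ln(y), atol=1e-6))  # True
```

Note the equivalence holds only at initialization: once trained, GroupNorm learns per-channel affine parameters while LayerNorm learns per-element ones, so the two can diverge.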
so we could do

nn.GroupNorm(1, out_channels)

and we would not have to compute L_out after applying Conv1d; it behaves the same as the LayerNorm case above.
So, to compare BatchNorm with GroupNorm (or, equivalently, the LayerNorm case above), we only have to replace
nn.BatchNorm1d(out_channels)
with
nn.GroupNorm(1, out_channels)
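As a sketch of what that swap looks like in practice (the toy block, channel counts, and `make_block` helper are my own illustration, not from any particular model):

```python
import torch
import torch.nn as nn

def make_block(norm):
    # hypothetical conv block: the only thing that changes is the norm layer
    return nn.Sequential(nn.Conv1d(3, 8, 3), norm, nn.ReLU())

bn_block = make_block(nn.BatchNorm1d(8))   # stats over batch + length, per channel
gn_block = make_block(nn.GroupNorm(1, 8))  # stats over channels + length, per sample

x = torch.randn(4, 3, 10)
print(bn_block(x).shape)  # torch.Size([4, 8, 8])
print(gn_block(x).shape)  # torch.Size([4, 8, 8])
```

Both take `out_channels` as their argument and leave the output shape unchanged; the difference is which axes the statistics are computed over, so GroupNorm's behavior does not depend on batch size.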