Need Help in choosing pytorch normalization function for Normalizing utterence based time series data

Sasank_Kottapalli · June 4, 2020, 8:44am

I am working in audio domain temporal convolutions.
I have a 2d tensor N*K where N is features and K is time frames.

Which default pytorch normalization should I use for the below code?

class ChannelwiseLayerNorm(nn.Module):
"""Channel-wise Layer Normalization (cLN)"""
def __init__(self, channel_size):
    super(ChannelwiseLayerNorm, self).__init__()
    self.gamma = nn.Parameter(torch.Tensor(1, channel_size, 1))  # [1, N, 1]
    self.beta = nn.Parameter(torch.Tensor(1, channel_size,1 ))  # [1, N, 1]
    self.reset_parameters()

def reset_parameters(self):
    self.gamma.data.fill_(1)
    self.beta.data.zero_()

def forward(self, y):
    """
    Args:
        y: [M, N, K], M is batch size, N is channel size, K is time-frames
    Returns:
        cLN_y: [M, N, K]
    """
    mean = torch.mean(y, dim=1, keepdim=True)  # [M, 1, K]
    var = torch.var(y, dim=1, keepdim=True, unbiased=False)  # [M, 1, K]
    cLN_y = self.gamma * (y - mean) / torch.pow(var + EPS, 0.5) + self.beta
    return cLN_y

ptrblck · June 5, 2020, 6:51am

I haven’t tested your implementation, but wouldn’t the code be equal to nn.BatchNorm1d with track_running_stats=False?

Sasank_Kottapalli · June 5, 2020, 8:13am

Does it have N gamma and beta parameters as present in the code? Also dont need running mean and variance.

ptrblck · June 5, 2020, 8:37am

Yes, gamma and beta are the affine parameters, which are created by default as weight and bias.
You can disable them by specifying affine=False, but they will be created by default.