BatchNorm Learnable Parameters

In batch normalization's pre-activation scaling, are the gamma and beta parameters learnable?

BatchNorm layers define trainable parameters by default, where the weight corresponds to the gamma parameter from the original paper and the bias corresponds to the beta parameter.
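
A quick way to check this yourself is sketched below. It assumes PyTorch and uses `torch.nn.BatchNorm2d` as an example layer (the layer type and the `num_features=3` value are my choices for illustration, not something from the question). It lists the layer's parameters, confirms they require gradients, and shows that passing `affine=False` drops them.

```python
import torch
from torch import nn

# Assumed example layer: 2D batch norm over 3 channels (affine=True is the default).
bn = nn.BatchNorm2d(num_features=3)

# weight (gamma) and bias (beta) are registered as parameters of the module.
param_names = [name for name, _ in bn.named_parameters()]
requires_grad = {name: p.requires_grad for name, p in bn.named_parameters()}

# The running statistics are buffers, so the optimizer does not update them.
buffer_names = [name for name, _ in bn.named_buffers()]

# With affine=False the layer only normalizes and has no learnable parameters.
bn_fixed = nn.BatchNorm2d(num_features=3, affine=False)
fixed_param_count = len(list(bn_fixed.parameters()))

print(param_names)        # names of the learnable parameters
print(requires_grad)      # whether each one receives gradients
print(buffer_names)       # non-learnable running statistics
print(fixed_param_count)  # number of parameters when affine=False
```

If you want to keep gamma and beta but freeze them, setting `bn.weight.requires_grad_(False)` and `bn.bias.requires_grad_(False)` should do it, rather than rebuilding the layer with `affine=False`.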
