According to the documentation for torch.nn.InstanceNorm1d, when affine is set to True, the beta (additive) and gamma (multiplicative) parameters are learnable.
When affine is set to False, should we infer that beta and gamma are simply absent (i.e., functionally 0 and 1, respectively)?
Also, no details are given about the parameter initialization: when affine is True, how are they initialized? Can the initialization be changed?
(The same question applies to the 2d version, if the answers are different.)
Alas, that code is rather ambiguous: it references weight and bias rather than gamma and beta.
Developers, would I be correct in thinking that bias is beta and weight is gamma?
Weight and bias should correspond to gamma and beta as in the nn.BatchNorm case.
Although the parameters are named differently, this keeps the API naming scheme consistent.
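A quick sketch to check this (assuming a recent PyTorch version, where affine=True registers the parameters as weight and bias, initialized to ones and zeros respectively):

```python
import torch
import torch.nn as nn

m = nn.InstanceNorm1d(4, affine=True)
print(m.weight)  # gamma: Parameter of shape (4,), initialized to ones
print(m.bias)    # beta: Parameter of shape (4,), initialized to zeros

# With affine=False the parameters are simply absent (registered as None):
assert nn.InstanceNorm1d(4, affine=False).weight is None

# The initialization can be changed like any other parameter:
with torch.no_grad():
    nn.init.normal_(m.weight, mean=1.0, std=0.02)
    nn.init.constant_(m.bias, 0.1)
```

The same applies to nn.InstanceNorm2d, which shares this parameter handling.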
Thanks for clarifying the question, as I'd missed that part.
Yes, as can be seen in this line of code, at::batch_norm will be called, where the multiplicative factor gamma corresponds to weight and the additive value beta corresponds to bias.
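As a sketch of that correspondence (the manual normalization below is my own illustration of what the functional form computes, not the actual library code):

```python
import torch
import torch.nn.functional as F

x = torch.randn(2, 3, 5)
weight = torch.rand(3) + 0.5   # plays the role of gamma
bias = torch.randn(3)          # plays the role of beta

out = F.instance_norm(x, weight=weight, bias=bias, eps=1e-5)

# Manual version: normalize each (sample, channel) over the spatial dim,
# then apply gamma * x_hat + beta.
mean = x.mean(dim=2, keepdim=True)
var = x.var(dim=2, keepdim=True, unbiased=False)
x_hat = (x - mean) / torch.sqrt(var + 1e-5)
manual = weight.view(1, 3, 1) * x_hat + bias.view(1, 3, 1)

print(torch.allclose(out, manual, atol=1e-5))
```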