What's the effect if BatchNorm parameters are passed to the optimizer?

I noticed that named_parameters() returns the BatchNorm (bn) parameters too. What's the effect if I pass the bn parameters to the optimizer?

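Calling named_parameters() on a BatchNorm layer returns its weight and bias, which correspond to gamma and beta in the BatchNorm paper.
These are the learnable affine parameters of the layer, applied after the normalization step. Training can set them so that they partly or fully undo the normalization if that lowers the loss. The running stats (running_mean and running_var) are buffers rather than parameters, so named_parameters() does not return them; they are updated during the forward pass in training mode, not by the optimizer.
Passing weight and bias to the optimizer is therefore the expected behavior: they are trained like any other parameter.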
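To make the split between parameters and buffers concrete, here is a minimal sketch. It assumes a standalone nn.BatchNorm1d layer with 4 features, a plain SGD optimizer, and random input and target tensors; all of these are illustrative choices, not something from the original question.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative layer: 4 features, default affine=True and track_running_stats=True.
bn = nn.BatchNorm1d(4)

# Learnable parameters (gamma, beta) versus non-learnable buffers (running stats).
param_names = [name for name, _ in bn.named_parameters()]
buffer_names = [name for name, _ in bn.named_buffers()]
print(param_names)   # ['weight', 'bias']
print(buffer_names)  # ['running_mean', 'running_var', 'num_batches_tracked']

# Only weight and bias are handed to the optimizer.
optimizer = torch.optim.SGD(bn.parameters(), lr=0.1)

weight_before = bn.weight.detach().clone()
bias_before = bn.bias.detach().clone()
running_mean_before = bn.running_mean.clone()

# Random data, purely for illustration.
x = torch.randn(8, 4)
target = torch.randn(8, 4)

bn.train()
loss = ((bn(x) - target) ** 2).mean()  # the forward pass updates the running stats
running_mean_after_forward = bn.running_mean.clone()

optimizer.zero_grad()
loss.backward()
optimizer.step()  # the optimizer step updates weight and bias only

weight_changed = not torch.equal(bn.weight, weight_before)
bias_changed = not torch.equal(bn.bias, bias_before)
running_mean_touched_by_step = not torch.equal(
    bn.running_mean, running_mean_after_forward
)
print(weight_changed, bias_changed, running_mean_touched_by_step)
```

In this sketch the running mean changes during the forward pass, while optimizer.step() only modifies weight and bias.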