From the documentation of batchnorm: “affine – a boolean value that when set to true, gives the layer learnable affine parameters. Default: True”. So, when I set affine=False, are gamma and beta from Ioffe’s paper 1 and 0, or are they the propagated standard deviation and mean?

When `affine=False`, the output of `BatchNorm` is equivalent to considering `gamma=1` and `beta=0` as constants.

By assigning `affine=False`, are the parameters `gamma` and `beta` still learnable, or are they fixed to the constant values `gamma=1`, `beta=0`?

By the way, how could I assign the initial gamma and beta values, if assigning `affine=False` does not initialize `gamma` and `beta` but fixes their values?

`affine=False` is equivalent to simply computing:

`y = (x - mu) / sqrt(var + eps)`

where `mu` and `var` are the mean and variance of the current batch in training mode, and the running (propagated) mean and variance in eval mode. Equivalently, this can be interpreted as fixing `gamma=1` and `beta=0`. They are then non-trainable: since they don’t appear in the equation above, no gradients are calculated for them.
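To check this yourself, here is a minimal sketch that compares the layer output with the formula computed by hand. The tensor sizes and the seed are arbitrary choices of mine, and I'm assuming the default `track_running_stats=True`:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(8, 4)  # 8 samples, 4 features (arbitrary sizes)

bn = nn.BatchNorm1d(4, affine=False)

# With affine=False no gamma/beta parameters are registered at all.
assert bn.weight is None and bn.bias is None
assert len(list(bn.parameters())) == 0

# Training mode: normalization uses the statistics of the current batch.
bn.train()
y_train = bn(x)
mu = x.mean(dim=0)
var = x.var(dim=0, unbiased=False)  # biased variance is used for normalization
y_manual_train = (x - mu) / torch.sqrt(var + bn.eps)
train_matches = torch.allclose(y_train, y_manual_train, atol=1e-5)

# Eval mode: normalization uses the running (propagated) estimates.
bn.eval()
y_eval = bn(x)
y_manual_eval = (x - bn.running_mean) / torch.sqrt(bn.running_var + bn.eps)
eval_matches = torch.allclose(y_eval, y_manual_eval, atol=1e-5)
```

Both `train_matches` and `eval_matches` should come out `True`, which is the `gamma=1`, `beta=0` behaviour described above.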

If you’d rather initialize gamma and beta to (1, 0) and train them, you could do something like:

```
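import torch.nn as nn
num_c = 16  # example number of channels
bn = nn.BatchNorm1d(num_c, affine=True)
# Assigning a plain number to bn.weight raises a TypeError, so fill in place.
nn.init.ones_(bn.weight)   # gamma = 1 (already the default init)
nn.init.zeros_(bn.bias)    # beta = 0 (already the default init)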
```

The formula you’ve given would be used for `affine=False`. I guess you have a typo in your post.

Ah. My bad. Thanks for pointing that out.

Hi there,

What if I initialize the model parameters from a pre-trained model (e.g. a .pth file) and want to keep gamma and beta frozen while the running mean and var keep changing when I resume training?

Thank you.

You could set the `.requires_grad` attributes of the `.weight` and `.bias` parameters to `False` and keep this layer in `.train()` mode.
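A rough sketch of that idea follows. The tiny `nn.Sequential` model, the random data and the SGD settings are placeholders I made up for illustration; in practice you would load your own checkpoint before freezing:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for a model restored from a .pth checkpoint.
model = nn.Sequential(nn.Linear(4, 4), nn.BatchNorm1d(4))
bn = model[1]

# Freeze gamma (weight) and beta (bias) of the batchnorm layer.
bn.weight.requires_grad_(False)
bn.bias.requires_grad_(False)

# Hand only the still-trainable parameters to the optimizer.
optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad], lr=0.1
)

model.train()  # train mode keeps updating running_mean / running_var
gamma_before = bn.weight.clone()
mean_before = bn.running_mean.clone()

x = torch.randn(8, 4)
target = torch.randn(8, 4)
loss = (model(x) - target).pow(2).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()

gamma_unchanged = torch.equal(bn.weight, gamma_before)
running_mean_changed = not torch.equal(bn.running_mean, mean_before)
```

After the step, gamma and beta should be untouched (and `bn.weight.grad` stays `None`), while the running statistics have moved. Note that calling `model.eval()` on the whole model would stop the running statistics from updating as well.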