Why does the ResNet model provided by PyTorch omit biases from the convolutional layers?

Hi,
I was trying to implement my own ResNet, using the model already provided by PyTorch as a reference. I noticed, however, that none of the convolutional layers have biases, as declared here:

import torch.nn as nn

def conv3x3(in_planes, out_planes, stride=1):
    """3x3 convolution with padding"""
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                     padding=1, bias=False)
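
In the ResNet blocks, each of these convolutions is immediately followed by a BatchNorm2d. A simplified sketch of that pattern (the real BasicBlock also has a second conv/bn pair and the residual shortcut; this is just the skeleton, using the conv3x3 above):

import torch.nn as nn

class ConvBNReLU(nn.Module):
    """Simplified conv -> batchnorm -> relu unit, as used throughout ResNet."""
    def __init__(self, in_planes, planes, stride=1):
        super().__init__()
        self.conv = conv3x3(in_planes, planes, stride)  # note bias=False
        self.bn = nn.BatchNorm2d(planes)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.bn(self.conv(x)))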

Is there any reason for this? I didn’t see any mention of removing biases in the original paper.

See author’s answer here:

The batchnorm layers handle the biases.
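
The reason this works: BatchNorm subtracts the per-channel batch mean right after the convolution, so any constant bias the conv adds is cancelled exactly, and BatchNorm's own learnable shift takes over that role. A minimal sanity check sketch (in training mode, so batch statistics are used):

import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(8, 16, 32, 32)

conv_bias = nn.Conv2d(16, 32, kernel_size=3, padding=1, bias=True)
conv_nobias = nn.Conv2d(16, 32, kernel_size=3, padding=1, bias=False)
conv_nobias.weight.data.copy_(conv_bias.weight.data)  # same filters, no bias

bn = nn.BatchNorm2d(32)
bn.train()  # normalize with batch statistics, as during training

# The per-channel mean subtraction in BatchNorm removes the constant
# bias, so both paths produce the same output.
out_with = bn(conv_bias(x))
out_without = bn(conv_nobias(x))
print(torch.allclose(out_with, out_without, atol=1e-5))  # True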

Oh! In that case, what are the weights in the batchnorm layer?
I was under the impression that the bias and weights in the batchnorm were the expectation and variance, respectively.

In the original batchnorm paper they mention the learnable parameters beta and gamma:

y = gamma * x_hat + beta,   where   x_hat = (x - mean) / sqrt(var + eps)

And in PyTorch the batchnorm implementation has a weight and a bias in addition to the running mean and running variance.

http://pytorch.org/docs/master/_modules/torch/nn/modules/batchnorm.html

Edit: the weight and bias are used to scale and shift the normalized output of the layer (gamma and beta in the paper's notation). The running mean and variance are buffers, not learnable parameters.
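
To make the mapping concrete, here is a quick look at what nn.BatchNorm2d stores (the correspondence to the paper's symbols is annotated in the comments):

import torch.nn as nn

bn = nn.BatchNorm2d(32)

# Learnable affine parameters (the paper's gamma and beta):
print(bn.weight.shape, bn.weight.requires_grad)  # torch.Size([32]) True -> gamma (scale)
print(bn.bias.shape, bn.bias.requires_grad)      # torch.Size([32]) True -> beta (shift)

# Non-learnable buffers, updated with running estimates during training:
print(bn.running_mean.shape)  # torch.Size([32])
print(bn.running_var.shape)   # torch.Size([32]) -- variance, not standard deviation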

Okay, I get it now.

Thanks! This clears everything up.

In the paper, Section 3.4, last sentence:

"Whereas Dropout (Srivastava et al., 2014) is typically used to reduce overfitting, in a batch-normalized network we found that it can be either removed or reduced in strength."