Hi
I was trying to implement my own ResNet model, using the model already provided by PyTorch as a reference. I noticed, however, that none of the convolutional layers have biases, as declared here:
Oh! In that case what are the weights in the batchnorm layer?
I was under the impression that the bias and weight in the batchnorm layer were the expectation and variance, respectively.
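For anyone with the same question, here is a minimal sketch that inspects a batchnorm layer directly. It assumes a recent PyTorch install; the layer sizes, the input shape, and the variable names (`conv_no_bias`, `conv_bias`, `same`) are made up for illustration and are not taken from the torchvision ResNet code. It shows two things: `weight` and `bias` in `nn.BatchNorm2d` are the learnable scale and shift (gamma and beta in the paper), while the mean and variance live in separate buffers, and a bias added by a preceding conv is cancelled when batchnorm subtracts the batch mean.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

bn = nn.BatchNorm2d(num_features=4)

# Learnable affine parameters: weight is the scale (gamma), bias is the shift (beta).
param_names = [name for name, _ in bn.named_parameters()]
# Non-learnable buffers: running estimates of the per-channel mean and variance.
buffer_names = [name for name, _ in bn.named_buffers()]

print(param_names)   # ['weight', 'bias']
print(buffer_names)  # ['running_mean', 'running_var', 'num_batches_tracked']

# Two convs with identical kernels; only one of them adds a per-channel bias.
conv_no_bias = nn.Conv2d(3, 4, kernel_size=3, padding=1, bias=False)
conv_bias = nn.Conv2d(3, 4, kernel_size=3, padding=1, bias=True)
with torch.no_grad():
    conv_bias.weight.copy_(conv_no_bias.weight)
    conv_bias.bias.uniform_(-1.0, 1.0)  # arbitrary nonzero bias values

x = torch.randn(8, 3, 16, 16)  # hypothetical input batch

# In training mode batchnorm normalizes with the statistics of the current batch,
# so a constant per-channel offset from the conv bias is removed with the mean.
bn.train()
out_no_bias = bn(conv_no_bias(x))
out_bias = bn(conv_bias(x))

same = torch.allclose(out_no_bias, out_bias, atol=1e-5)
print(same)
```

If the two outputs match, the conv bias had no effect on what comes out of the batchnorm layer, which is presumably why the conv layers are declared with `bias=False`: the shift is handled by the batchnorm `bias` instead.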
In the paper, Section 3.4, last sentence:
Whereas Dropout (Srivastava et al., 2014) is typically used to reduce overfitting,
in a batch-normalized network we found that it can be either removed or reduced in strength.