Is there a reason to set bias=False in torchvision's DenseNet?

John1231983 · July 31, 2018, 6:55am

Hello all, I have read the DenseNet implementation in torchvision. I found that they set bias=False in the convolution after the batch norm layer. It looks redundant because the batch norm already does it. Is there any reason why they add it? Thanks

https://github.com/pytorch/vision/blob/b51a2c316cf80b8cb5f708dab51fbfa0f83f73d2/torchvision/models/densenet.py#L132


                del state_dict[key]
        model.load_state_dict(state_dict)
    return model


class _DenseLayer(nn.Sequential):
    def __init__(self, num_input_features, growth_rate, bn_size, drop_rate):
        super(_DenseLayer, self).__init__()
        self.add_module('norm1', nn.BatchNorm2d(num_input_features)),
        self.add_module('relu1', nn.ReLU(inplace=True)),
        self.add_module('conv1', nn.Conv2d(num_input_features, bn_size *
                        growth_rate, kernel_size=1, stride=1, bias=False)),
        self.add_module('norm2', nn.BatchNorm2d(bn_size * growth_rate)),
        self.add_module('relu2', nn.ReLU(inplace=True)),
        self.add_module('conv2', nn.Conv2d(bn_size * growth_rate, growth_rate,
                        kernel_size=3, stride=1, padding=1, bias=False)),
        self.drop_rate = drop_rate

    def forward(self, x):
        new_features = super(_DenseLayer, self).forward(x)
        if self.drop_rate > 0:

ptrblck · July 31, 2018, 9:50am

Could you explain your concern a bit?

Usually the bias is removed in conv layers before a batch norm layer, as the batch norm’s beta parameter (bias of nn.BatchNorm) will have the same effect and the bias of the conv layer might be canceled out by the mean subtraction.

From the batch norm paper:

Note that, since we normalize Wu+b, the bias b can be ignored since its effect will be canceled by the subsequent mean subtraction (the role of the bias is subsumed by β in Alg. 1).
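The cancellation described in the quote can be checked numerically. The following is a minimal sketch, not code from torchvision: the layer sizes, the input shape, and the bias value of 5.0 are arbitrary choices made for illustration. It assumes the batch norm layer is in training mode, so that it normalizes with the statistics of the current batch.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Two convolutions that share the same weights; only one has a bias.
conv_bias = nn.Conv2d(3, 8, kernel_size=3, padding=1, bias=True)
conv_nobias = nn.Conv2d(3, 8, kernel_size=3, padding=1, bias=False)
with torch.no_grad():
    conv_nobias.weight.copy_(conv_bias.weight)
    # A large constant bias, so any leftover effect would be easy to spot.
    conv_bias.bias.fill_(5.0)

bn = nn.BatchNorm2d(8)
bn.train()  # training mode: normalize with the current batch statistics

x = torch.randn(4, 3, 16, 16)
out_bias = bn(conv_bias(x))      # conv with bias, then batch norm
out_nobias = bn(conv_nobias(x))  # conv without bias, then batch norm

# The per-channel mean subtraction removes the constant bias, so the two
# outputs should agree up to floating point rounding.
max_diff = (out_bias - out_nobias).abs().max().item()
print(max_diff)
```

The printed difference should be at the level of float32 rounding error. Note that this holds for batch statistics; in eval mode the running mean is used instead, which would have absorbed the bias during training.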

John1231983 · July 31, 2018, 10:48am

Sorry, I was mistaken. bias=False means the layer has no bias term, and the default is True. Never mind. Thanks so much!
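For reference, a short sketch of what the flag does on nn.Conv2d. The channel counts and kernel size here are arbitrary values picked for the example, not the ones used in DenseNet.

```python
import torch.nn as nn

# bias defaults to True, so this layer has one bias value per output channel.
conv_default = nn.Conv2d(16, 32, kernel_size=3)
# With bias=False the layer has no bias parameter at all.
conv_no_bias = nn.Conv2d(16, 32, kernel_size=3, bias=False)

print(conv_default.bias.shape)  # shape of the bias vector
print(conv_no_bias.bias)        # the attribute exists but is None

# Count the learnable parameters of each layer.
n_default = sum(p.numel() for p in conv_default.parameters())
n_no_bias = sum(p.numel() for p in conv_no_bias.parameters())
print(n_default - n_no_bias)    # the difference is the number of bias values
```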

pvardanis · March 11, 2020, 7:36pm

I guess if we set bias=True on layers after batch norm layers, then there shouldn’t be any issue, right? At least that’s what I understand from the last layers of torchvision.models.vgg16_bn():

(avgpool): AdaptiveAvgPool2d(output_size=(7, 7))
(classifier): Sequential(
  (0): Linear(in_features=25088, out_features=4096, bias=True)
  (1): ReLU(inplace=True)
  (2): Dropout(p=0.5, inplace=False)
  (3): Linear(in_features=4096, out_features=4096, bias=True)
  (4): Dropout(p=0.5, inplace=False)
  (5): Linear(in_features=4096, out_features=1024, bias=True)
)

ptrblck · March 11, 2020, 7:43pm

There shouldn’t be any error in the code. However, as mentioned above, in specific use cases you might save some computation by leaving out a bias that a following batch norm layer would cancel anyway.
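To illustrate the other direction, here is a minimal sketch with a toy stack (a BatchNorm1d followed by a Linear layer, with made-up sizes; this is not the VGG classifier). When the layer with the bias comes after the batch norm and nothing normalizes its output afterwards, the bias is not canceled and does change the result.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

bn = nn.BatchNorm1d(10)
bn.train()  # use batch statistics, as during training
fc = nn.Linear(10, 4, bias=True)  # the bias comes AFTER the batch norm

x = torch.randn(32, 10)
out_before = fc(bn(x))

with torch.no_grad():
    fc.bias.add_(1.0)  # shift every bias value by 1

out_after = fc(bn(x))

# No normalization follows the linear layer, so the shift passes straight
# through to the output instead of being removed by a mean subtraction.
shift = (out_after - out_before).mean().item()
print(shift)
```

The output moves by the amount added to the bias, so in this position the bias is a meaningful parameter rather than redundant computation.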