I guess that keeping bias=True on layers that come after batch-norm layers (rather than immediately before them) shouldn't cause any issue, right? At least that's what I understand from the last layers of torchvision.models.vgg16_bn():
(avgpool): AdaptiveAvgPool2d(output_size=(7, 7))
(classifier): Sequential(
(0): Linear(in_features=25088, out_features=4096, bias=True)
(1): ReLU(inplace=True)
(2): Dropout(p=0.5, inplace=False)
(3): Linear(in_features=4096, out_features=4096, bias=True)
(4): ReLU(inplace=True)
(5): Dropout(p=0.5, inplace=False)
(6): Linear(in_features=4096, out_features=1000, bias=True)
)
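To see why the bias only becomes redundant on the layer *feeding into* a BatchNorm layer, here is a minimal sketch (layer sizes are arbitrary, not taken from VGG): BatchNorm subtracts the per-channel batch mean, so a constant bias added just before it is cancelled exactly, while a layer placed after BN (like the classifier Linear layers above, which have no BN following them) still makes full use of its bias.

```python
import torch
import torch.nn as nn

# Two convs with identical weights; only the bias differs.
conv_bias = nn.Conv2d(3, 4, kernel_size=3, bias=True)
conv_nobias = nn.Conv2d(3, 4, kernel_size=3, bias=False)
with torch.no_grad():
    conv_nobias.weight.copy_(conv_bias.weight)

# In train mode BN normalizes with the batch statistics, so the constant
# bias is subtracted out along with the per-channel mean.
bn = nn.BatchNorm2d(4)

x = torch.randn(2, 3, 8, 8)
with torch.no_grad():
    same = torch.allclose(bn(conv_bias(x)), bn(conv_nobias(x)), atol=1e-5)
print(same)  # the bias vanishes after normalization
```

So setting bias=False is just a (small) parameter saving on conv/linear layers that are directly followed by BatchNorm; keeping bias=True there is harmless apart from the wasted parameters.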