Hello folks,
I'm a little confused about weight initialization. Since I'm training my network (resnet18) from scratch, I'd like to initialize the weights with a uniform distribution, using the following function:
def init_weights(m):
    if isinstance(m, nn.Conv2d):
        nn.init.kaiming_uniform_(m.weight, nonlinearity='relu')
    elif isinstance(m, nn.Linear):
        nn.init.kaiming_uniform_(m.weight, nonlinearity='relu')
    elif isinstance(m, nn.BatchNorm2d):
        nn.init.kaiming_uniform_(m.weight, nonlinearity='relu')
    elif isinstance(m, nn.MaxPool2d):
        nn.init.kaiming_uniform_(m.weight, nonlinearity='relu')
    elif isinstance(m, nn.AdaptiveAvgPool2d):
        nn.init.kaiming_uniform_(m.weight, nonlinearity='relu')
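In case it matters, I'm applying the function recursively with `Module.apply`. Here is a minimal, self-contained version of what I mean (a small stand-in model instead of resnet18, and only the branches that work, just to show the mechanism):

```python
import torch
import torch.nn as nn

def init_weights(m):
    # only touch layer types that have a weight with 2+ dimensions
    if isinstance(m, (nn.Conv2d, nn.Linear)):
        nn.init.kaiming_uniform_(m.weight, nonlinearity='relu')

# stand-in for resnet18, just to demonstrate apply()
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.BatchNorm2d(8))
model.apply(init_weights)  # apply() visits every submodule recursively
```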
Do I need to initialize every layer like this, or is there an easier way? And is it necessary to zero the biases? When I run the function above, I get a ValueError. Am I missing something?
ValueError: Fan in and fan out can not be computed for tensor with fewer than 2 dimensions
EDIT: Apparently the BatchNorm2d layers are the problem and cannot be initialized this way. Do I need a different initialization for this kind of layer? And what about the pooling layers, do they need to be initialized at all?
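Would something like this be the right approach? This is just my guess at a fix: BatchNorm's weight is 1D (one scale factor per channel), so fan in/out can't be computed, and I'd set it to ones and the bias to zeros instead; the pooling layers seem to have no parameters at all, so I'd skip them entirely:

```python
import torch
import torch.nn as nn

def init_weights(m):
    if isinstance(m, (nn.Conv2d, nn.Linear)):
        nn.init.kaiming_uniform_(m.weight, nonlinearity='relu')
        # resnet18's conv layers have bias=False, so guard the bias init
        if m.bias is not None:
            nn.init.zeros_(m.bias)
    elif isinstance(m, nn.BatchNorm2d):
        # weight here is the per-channel scale (gamma), a 1D tensor,
        # so kaiming init fails; use constants instead?
        nn.init.ones_(m.weight)
        nn.init.zeros_(m.bias)
    # MaxPool2d / AdaptiveAvgPool2d have no weights, so nothing to do
```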