Hi,
For the first question, please see these posts:
I have explained the magic number math.sqrt(5) there,
so you can also get the idea behind the relation between the non-linearity and the init method. Actually, the default initialization is uniform.
Also, see this reply in the GitHub thread about it: https://github.com/pytorch/pytorch/issues/15314#issuecomment-477448573
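To make the uniform default concrete, here is a rough paraphrase of what the default reset_parameters() of nn.Linear / nn.Conv2d does (a sketch, not the exact source; details can differ between PyTorch versions, and _calculate_fan_in_and_fan_out is a private helper):
import math
import torch.nn as nn
from torch.nn import init

def default_init_sketch(layer):
    """Rough paraphrase of the default reset_parameters() of nn.Linear / nn.Conv2d."""
    # Kaiming uniform with a=math.sqrt(5); for the weight this works out to
    # U(-1/sqrt(fan_in), +1/sqrt(fan_in))
    init.kaiming_uniform_(layer.weight, a=math.sqrt(5))
    if layer.bias is not None:
        fan_in, _ = init._calculate_fan_in_and_fan_out(layer.weight)
        bound = 1 / math.sqrt(fan_in)
        init.uniform_(layer.bias, -bound, bound)

default_init_sketch(nn.Linear(16, 32))  # same distribution the layer already got at construction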
Regarding the second question, you can reinitialize the weights after they have been initialized with the default values. To do so, create your own init function using the initializers available in the torch.nn.init
package, and use code similar to the following snippet:
import torch.nn as nn


def init_weights(m):
    """
    Initialize weights of layers using Kaiming Normal (He et al.) as the argument of the "apply"
    function of "nn.Module".
    :param m: Layer to initialize
    :return: None
    """
    if isinstance(m, nn.Conv2d):
        nn.init.kaiming_normal_(m.weight, mode='fan_out')
        if m.bias is not None:  # conv layers created with bias=False have no bias tensor
            nn.init.constant_(m.bias, 0)
    elif isinstance(m, nn.BatchNorm2d):
        nn.init.constant_(m.weight, 1)
        nn.init.constant_(m.bias, 0)


model.apply(init_weights)
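For example, with a hypothetical small model (the layers and sizes here are just for illustration), the order is: build the model first, then apply the init function:
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(inplace=True),
)
model.apply(init_weights)  # apply() visits every submodule recursively and calls init_weights on each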
Best regards