Default weight initialisation for Conv layers (including SELU)

Hi,

For the first question, please see these posts:

  1. Clarity on default initialization in pytorch
  2. CNN default initialization understanding

In those posts I explain the magic number math.sqrt(5), so you can also get an idea of the relation between the non-linearity and the init method. Note that the default initialization is actually Kaiming uniform, not normal.
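As a quick illustration, here is a minimal sketch of roughly what the default reset_parameters of a conv layer does (paraphrased from the linked posts; check the source of your PyTorch version for the exact code):

import math
import torch.nn as nn

conv = nn.Conv2d(3, 16, kernel_size=3)

# Weights: Kaiming *uniform* with the a=math.sqrt(5) "magic number"
nn.init.kaiming_uniform_(conv.weight, a=math.sqrt(5))

# Bias: uniform in [-1/sqrt(fan_in), 1/sqrt(fan_in)]
if conv.bias is not None:
    # fan_in of the conv weight (assuming groups=1)
    fan_in = conv.in_channels * conv.kernel_size[0] * conv.kernel_size[1]
    bound = 1 / math.sqrt(fan_in)
    nn.init.uniform_(conv.bias, -bound, bound)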

Also, see this reply in the GitHub issue about it: https://github.com/pytorch/pytorch/issues/15314#issuecomment-477448573

About the second question: you can reinitialize the weights after they have been set to their default values. To do so, write your own init function using the initializers in the torch.nn.init package and pass it to the apply method of your model, as in the following snippet:

import torch.nn as nn


def init_weights(m):
    """
    Initialize the weights of conv and batch-norm layers with Kaiming Normal
    (He et al.); meant to be passed to the "apply" method of "nn.Module".

    :param m: layer (submodule) to initialize
    :return: None
    """
    if isinstance(m, nn.Conv2d):
        # He initialization for the conv weights
        nn.init.kaiming_normal_(m.weight, mode='fan_out')
        if m.bias is not None:  # conv layers created with bias=False have no bias
            nn.init.constant_(m.bias, 0)
    elif isinstance(m, nn.BatchNorm2d):
        # BatchNorm: scale to 1, shift to 0
        nn.init.constant_(m.weight, 1)
        nn.init.constant_(m.bias, 0)


model.apply(init_weights)
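For completeness, here is an end-to-end usage sketch with a made-up toy model (any nn.Module works the same way; apply is called recursively on every submodule):

import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(inplace=True),
)
model.apply(init_weights)  # visits every submodule, including nested ones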

Best
