Understanding the need for reinitializing weights in code

I’ve been reading a piece of code I found on GitHub. I specifically don’t understand why we keep checking whether the model’s layers have a weight and a bias (is not None) right after we’ve created the model, and why we would then call nn.init.uniform_(module.weight) if the weights are already there. What’s the logic here? Thanks

import torch.nn as nn

# Flatten is defined elsewhere in the repo (presumably flattening to (batch_size, -1))

def get_head(nf: int, n_classes):
    model = nn.Sequential(
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),
        Flatten(),
        nn.BatchNorm1d(nf),
        nn.Dropout(p=0.25),
        nn.Linear(nf, n_classes),
    )
    for i, module in enumerate(model):
        if isinstance(module, (nn.BatchNorm1d, nn.BatchNorm2d)):
            # initialize the batchnorm parameters
            if module.weight is not None:
                nn.init.uniform_(module.weight)
            if module.bias is not None:
                nn.init.constant_(module.bias, 0)
        if isinstance(module, nn.Linear):
            # initialize the linear parameters
            if getattr(module, "weight_v", None) is not None:
                print("Initing linear with weight normalization")
                assert model[i].weight_g is not None
            else:
                nn.init.kaiming_normal_(module.weight)
                print("Initing linear")
            if module.bias is not None:
                nn.init.constant_(module.bias, 0)
    return model

Hi,

This is because a batchnorm layer can be created without the affine transformation (affine=False, see the code here). In that case, its weight and bias will be None.
For the linear layer, the .bias can be None for the same reason (bias=False).
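For example, a minimal sketch (the layer sizes are arbitrary, this is not the repo’s code), just to show when those attributes end up as None:

import torch.nn as nn

bn = nn.BatchNorm1d(16, affine=False)   # no learnable affine parameters
print(bn.weight, bn.bias)               # None None

fc = nn.Linear(16, 4, bias=False)       # no bias parameter
print(fc.weight.shape, fc.bias)         # torch.Size([4, 16]) None

bn_affine = nn.BatchNorm1d(16)          # default affine=True
print(bn_affine.weight.shape)           # torch.Size([16])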
For the weight_v attribute: it does not exist on a plain Linear layer. It (together with weight_g) is what torch.nn.utils.weight_norm adds when weight normalization is applied to the layer, so I guess the repo applies that to the layer somewhere else.
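A quick sketch of that case (assuming the repo would use torch.nn.utils.weight_norm; nothing in the snippet above shows where it is applied):

import torch.nn as nn
from torch.nn.utils import weight_norm

fc = nn.Linear(16, 4)
print(hasattr(fc, "weight_v"))                       # False -- a plain Linear only has .weight

fc_wn = weight_norm(nn.Linear(16, 4))                # reparameterizes weight as g * v / ||v||
print(fc_wn.weight_v.shape, fc_wn.weight_g.shape)    # torch.Size([4, 16]) torch.Size([4, 1])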


torch.Tensor(num_features) (which is how those parameters are allocated) returns uninitialized memory. So you definitely want to put some sensible values in there before using them 🙂
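To see what “uninitialized” means in practice (a small illustration, not the layer code itself):

import torch
import torch.nn as nn

t = torch.Tensor(3)       # uninitialized: whatever bytes happen to be in that memory
print(t)                  # could be zeros, could be garbage like 1e+35 or nan

nn.init.uniform_(t)       # fill in place with values drawn from U(0, 1)
print(t)

nn.init.constant_(t, 0)   # or set every element to a constant
print(t)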
