Weight Initialization

I am trying to use xavier_uniform_ initialization for batch and layer normalization layers, but I am receiving a ValueError: Fan in and fan out can not be computed for tensor with less than 2 dimensions. Is there any way I can rectify this problem?
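
For context, Xavier initialization scales values by the fan-in and fan-out of a tensor, which are only defined for tensors with at least two dimensions. The weight and bias of batch and layer normalization layers are 1-D, which is what triggers the error. A minimal sketch, assuming a hypothetical toy model (not the network from this post), that applies xavier_uniform_ only where it is defined:

```python
import torch
import torch.nn as nn

# Hypothetical toy model, used only for illustration.
model = nn.Sequential(
    nn.Linear(4, 8),
    nn.LayerNorm(8),
    nn.Linear(8, 2),
)

skipped = []
for name, param in model.named_parameters():
    if param.dim() >= 2:
        # Weight matrices have a well-defined fan-in and fan-out.
        nn.init.xavier_uniform_(param)
    else:
        # 1-D tensors (biases, normalization scales) would raise the ValueError.
        skipped.append(name)
```

The names collected in skipped are the parameters that need a different initializer, such as a constant.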

Also when I try to initialize my bias layers using:

for key, value in network.state_dict().items():
    if 'bias' in key:
        network.state_dict()[key] = torch.full_like(network.state_dict()[key], 0, requires_grad=True)
        # or
        network.state_dict()[key] = torch.full_like(network.state_dict()[key], 0)

When I print the state_dict after initializing all the weights and biases, the biases are still random numbers instead of 0. Am I using the wrong function or reassigning the wrong variable? How can I fix this?
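
A likely explanation is that every call to state_dict() builds a new dictionary, so assigning a tensor to one of its keys only changes that temporary dictionary and never reaches the model. A minimal sketch, assuming a hypothetical nn.Linear as a stand-in for the real network, that contrasts the reassignment with an in-place update of the parameters:

```python
import torch
import torch.nn as nn

network = nn.Linear(3, 2)  # hypothetical stand-in for the real network

with torch.no_grad():
    network.bias.fill_(1.0)  # make the bias visibly nonzero

# Rebinding a key only affects the temporary dict returned by this call.
network.state_dict()["bias"] = torch.zeros(2)
unchanged = network.bias.detach().clone()  # still all ones

# Modifying the parameters in place does change the model.
with torch.no_grad():
    for name, param in network.named_parameters():
        if "bias" in name:
            param.zero_()
```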

I think you would need to modify your current state dict and reload it in your model.
An alternative approach would be:

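import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 6, 3, 1, 1),
    nn.BatchNorm2d(6),
)

for child in model.children():
    if isinstance(child, nn.BatchNorm2d):
        with torch.no_grad():
            # Replace the bias with a new parameter sampled from U(0, 1)
            child.bias = nn.Parameter(torch.empty(6).uniform_())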
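
The first suggestion, modifying the state dict and reloading it, could look roughly like the sketch below. It assumes the same small model and sets every bias to 0; the key check with endswith("bias") is an illustrative choice, not something taken from the original code.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 6, 3, 1, 1),
    nn.BatchNorm2d(6),
)

# Take one copy of the state dict, edit it, then load it back into the model.
state = model.state_dict()
for key in state:
    if key.endswith("bias"):
        state[key] = torch.full_like(state[key], 0)
model.load_state_dict(state)
```

The important step is the final load_state_dict call; without it the edited dictionary is never copied back into the model.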


I am trying to use Kaiming and Xavier initialization, but the initialized weights are not being saved to the model. The code I am using to try to accomplish this:

new_weights = collections.OrderedDict()
for key, value in network.state_dict().items():
            if 'bias' in key:
                new_weights[key] = torch.full_like(network.state_dict()[key], 0)
            elif 'ltsm' in key:
                new_weights[key] = torch.nn.init.xavier_uniform_(value, gain=5/3)
            elif 'batch' in key or 'norm' in key:
                new_weights[key] = torch.nn.init.xavier_uniform_(value)
                # Is this the right initialization to use for batch and layer normalization?
                new_weights[key] = torch.nn.init.kaiming_uniform_(value, a=0, mode='fan_out', nonlinearity = 'relu')
        new_weights[key] = torch.full_like(network.state_dict()[key], 0)

I initially tried to directly reassign the weights but that did not work.

Did you load the state dict afterwards?
Could you try my approach of iterating the children?
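
One way to apply the module-iteration idea to a whole network is to pick the initializer by module type rather than by key name. This is only a sketch: init_weights is a hypothetical helper, and the choice of initializer per layer type is an assumption based on the code posted above, not a recommendation from the thread.

```python
import torch
import torch.nn as nn

def init_weights(module):
    """Hypothetical helper: choose an initializer based on the module type."""
    if isinstance(module, (nn.Conv2d, nn.Linear)):
        nn.init.kaiming_uniform_(module.weight, a=0, mode="fan_out", nonlinearity="relu")
        if module.bias is not None:
            nn.init.zeros_(module.bias)
    elif isinstance(module, nn.LSTM):
        for name, param in module.named_parameters():
            if "weight" in name:
                # calculate_gain("tanh") returns 5/3, the gain used in the post above.
                nn.init.xavier_uniform_(param, gain=nn.init.calculate_gain("tanh"))
            else:
                nn.init.zeros_(param)
    elif isinstance(module, (nn.BatchNorm2d, nn.LayerNorm)):
        # These parameters are 1-D, so Xavier/Kaiming do not apply; use constants.
        if module.weight is not None:
            nn.init.ones_(module.weight)
        if module.bias is not None:
            nn.init.zeros_(module.bias)

model = nn.Sequential(nn.Conv2d(3, 6, 3, 1, 1), nn.BatchNorm2d(6))
model.apply(init_weights)  # apply() visits every submodule recursively

lstm = nn.LSTM(4, 5)
lstm.apply(init_weights)
```

Unlike children(), which only yields direct submodules, apply() also reaches layers nested inside other containers.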

Thank you very much, iterating over the children worked!

I’ve been looking for recommendations on how to initialize the batchnorm weights, biases, running_mean, and running_var, as well as the layernorm weights and biases, but I can’t find any. Do you know if there is any special way to initialize those layers?
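
As a point of reference rather than a recommendation, PyTorch itself initializes these layers with constants: weight to ones, bias to zeros, running_mean to zeros, and running_var to ones. The sketch below only inspects those defaults and shows that reset_parameters() restores them; the layer sizes are arbitrary.

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm2d(6)
ln = nn.LayerNorm(6)

# Defaults right after construction.
bn_defaults_ok = (
    torch.equal(bn.weight.detach(), torch.ones(6))
    and torch.equal(bn.bias.detach(), torch.zeros(6))
    and torch.equal(bn.running_mean, torch.zeros(6))
    and torch.equal(bn.running_var, torch.ones(6))
)
ln_defaults_ok = (
    torch.equal(ln.weight.detach(), torch.ones(6))
    and torch.equal(ln.bias.detach(), torch.zeros(6))
)

# Perturb the layer, then restore the defaults.
with torch.no_grad():
    bn.weight.fill_(3.0)
    bn.running_mean.fill_(2.0)
bn.reset_parameters()  # resets the affine parameters and the running statistics
```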