How are layer weights and biases initialized by default?

knowledge_unlimited · January 30, 2018, 8:21pm

I was wondering how layer weights and biases are initialized by default. E.g. if I create the linear layer torch.nn.Linear(5, 100), how are the weights and biases for this layer initialized?

ptrblck · January 30, 2018, 8:30pm

Linear layers are initialized with

stdv = 1. / math.sqrt(self.weight.size(1))
self.weight.data.uniform_(-stdv, stdv)
if self.bias is not None:
    self.bias.data.uniform_(-stdv, stdv)

See also here.
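A quick way to check this empirically, as a minimal sketch assuming a reasonably recent PyTorch install. The exact code inside reset_parameters has changed between releases, so this only checks that the sampled values fall inside ±1/sqrt(in_features). The names layer, bound and tol are introduced here for illustration.

```python
import math
import torch

layer = torch.nn.Linear(5, 100)

# weight has shape (out_features, in_features) = (100, 5), so size(1) is 5
bound = 1.0 / math.sqrt(layer.weight.size(1))
tol = 1e-6  # small slack for float32 rounding

# Both lines should print True if the default init is uniform in [-bound, bound]
print(layer.weight.abs().max().item() <= bound + tol)
print(layer.bias.abs().max().item() <= bound + tol)
```

For Linear(5, 100) the bound is 1/sqrt(5), roughly 0.447.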

knowledge_unlimited · January 30, 2018, 10:07pm

Thanks! So it depends on the layer you use?

ptrblck · January 31, 2018, 11:48am

Every layer is initialized in some way when it is created. E.g. the conv layer is initialized like this.
However, it’s a good idea to use a suitable init function for your model.
Have a look at the init functions.
You can apply the weight inits like this:

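import torch.nn as nn

def weights_init(m):
    if isinstance(m, nn.Conv2d):
        # xavier_uniform_ needs at least 2 dimensions, so it is applied to the weight only
        nn.init.xavier_uniform_(m.weight)
        # assumes the layer was created with a bias
        nn.init.zeros_(m.bias)

model.apply(weights_init)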

squirrel · March 9, 2018, 1:43am

So it won’t throw any error if I forget to initialize some conv layers?

ptrblck · March 9, 2018, 3:30am

Correct, it won’t throw any errors; those layers simply keep their default initialization. Depending on your problem, training could be trickier.
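To illustrate, here is a small sketch with a hypothetical toy model (the layer sizes are arbitrary and only chosen for the example). The init function only matches nn.Conv2d, so the nn.Linear layer is silently skipped and keeps whatever values it had after construction.

```python
import torch
import torch.nn as nn

def weights_init(m):
    # Only Conv2d modules are re-initialized; everything else is left alone
    if isinstance(m, nn.Conv2d):
        nn.init.xavier_uniform_(m.weight)

# Hypothetical toy model, for illustration only
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(8, 4),
)

conv_before = model[0].weight.detach().clone()
linear_before = model[3].weight.detach().clone()

# apply() visits every submodule; unmatched modules raise nothing
model.apply(weights_init)

conv_changed = not torch.equal(conv_before, model[0].weight)
linear_unchanged = torch.equal(linear_before, model[3].weight)
print(conv_changed, linear_unchanged)
```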

spacemeerkat · August 20, 2018, 1:38pm

Is there a way to alter this code for a situation where you have nn.Conv2d layers whose bias can be on or off depending on their position in the network?

e.g. you have a first Conv2d with a bias term but then a later Conv2d with no bias term.

As the following will return an error:

    if isinstance(m, nn.Conv2d(bias=True):
        xavier(m.weight.data)
        xavier(m.bias.data)

ptrblck · August 20, 2018, 1:56pm

You could use a condition to check whether bias was set:

if isinstance(m, nn.Conv2d):
    torch.nn.init.xavier_uniform_(m.weight)
    if m.bias:
        torch.nn.init.xavier_uniform_(m.bias)

spacemeerkat · August 20, 2018, 2:03pm

If I try that I get the following error when using:

def weight_init(m):
    if isinstance(m, torch.nn.Conv2d) or isinstance(m, torch.nn.Linear):
        torch.nn.init.xavier_uniform_(m.weight)
        if m.bias: 
            torch.nn.init.xavier_uniform_(m.bias)

RuntimeError: bool value of Tensor with more than one value is ambiguous

ptrblck · August 20, 2018, 2:06pm

Sorry for the misleading code. It should be: if m.bias is not None:
Also, xavier_uniform_ will fail on the bias, as it has fewer than 2 dimensions, so fan_in and fan_out cannot be computed.

if isinstance(m, nn.Conv2d):
    torch.nn.init.xavier_uniform_(m.weight)
    if m.bias is not None:
        torch.nn.init.zeros_(m.bias)
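Put together as a runnable sketch, assuming PyTorch 0.4.1 or later (for zeros_). The two-layer model below is hypothetical and only there to show one conv with a bias and one without. The last part reproduces why the plain truthiness check fails.

```python
import torch
import torch.nn as nn

def weight_init(m):
    if isinstance(m, nn.Conv2d):
        nn.init.xavier_uniform_(m.weight)
        # bias is None when the layer was built with bias=False
        if m.bias is not None:
            nn.init.zeros_(m.bias)

# Hypothetical model: the first conv has a bias, the second does not
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, bias=True),
    nn.Conv2d(8, 8, kernel_size=3, bias=False),
)
model.apply(weight_init)

print(model[0].bias.abs().sum().item())  # bias exists and was zeroed
print(model[1].bias is None)             # no bias parameter at all

# A tensor with several elements has no single truth value,
# which is why `if m.bias:` raises
try:
    bool(model[0].bias)
    truthiness_failed = False
except RuntimeError as err:
    truthiness_failed = True
    print("RuntimeError:", err)
```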

spacemeerkat · August 20, 2018, 2:13pm

No, no, not at all. I should have been able to work that out for myself.

That seems to work, except that I get the following error from zeros_:

AttributeError: module 'torch.nn.init' has no attribute 'zeros_'

Perhaps it’s an outdated attribute?

ptrblck · August 20, 2018, 2:17pm

I think it was introduced in the latest release, i.e. 0.4.1.
I would recommend updating to it. If that’s not possible at the moment, you could use:

with torch.no_grad():
    m.bias.zero_()
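As a self-contained sketch of that fallback (the conv layer here is a hypothetical stand-in for m): the in-place zero_() is wrapped in torch.no_grad() so that autograd does not track the edit. As far as I know, recent versions reject an in-place change on a parameter that requires grad outside of such a block. The parameter stays trainable afterwards.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 8, kernel_size=3)  # stand-in for `m`

# In-place edit inside no_grad: autograd does not record it
with torch.no_grad():
    conv.bias.zero_()

print(conv.bias.abs().sum().item())  # all entries are now zero
print(conv.bias.requires_grad)       # still a trainable parameter
```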

spacemeerkat · August 20, 2018, 2:20pm

Ahh I’m using 0.4.0 so I will update to the newest version.

The code you sent works on 0.4.0 though, which is great. Thanks for your help, as always!

spacemeerkat · October 25, 2018, 3:42pm

Hi ptrblck,

I just wanted to follow up on this: if you were to use nn.Conv2d(..., bias=True), presumably the weight would be zeroed, would it not? Because True != None in Python…

Therefore, you must either use bias=False or not pass any bias information to nn.Conv2d? Does this sound right?

ptrblck · October 25, 2018, 5:09pm

If you don’t want to use the bias, you should set bias=False during the instantiation of the layer.
Are you somehow referring to its initialization? In my example I set the bias to zeros, if it’s available.
It’s still a learnable and used parameter in case you are wondering if the bias is useless afterwards.
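A small sketch of both points, using a hypothetical tiny conv layer and a made-up loss (the sum of the outputs) purely for illustration: a zero-initialized bias still receives gradients and moves after one optimizer step, while bias=False leaves the layer without any bias parameter.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(1, 2, kernel_size=3, bias=True)
nn.init.zeros_(conv.bias)

opt = torch.optim.SGD(conv.parameters(), lr=0.1)
x = torch.randn(4, 1, 5, 5)

# Made-up loss: each bias entry contributes to 4 * 3 * 3 = 36 outputs
loss = conv(x).sum()
loss.backward()
opt.step()

print(conv.bias)  # no longer zero after the update

# With bias=False the attribute exists but holds no parameter
no_bias = nn.Conv2d(1, 2, kernel_size=3, bias=False)
print(no_bias.bias is None, len(list(no_bias.parameters())))
```

With this loss the gradient for each bias entry is 36, so one SGD step with lr=0.1 moves each entry to about -3.6.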

spacemeerkat · October 26, 2018, 8:49am

Sorry, my mistake, I understand now. I meant bias = False in my first sentence above. I was concerned that, because False is not None in Python, it would somehow try to apply some bias initialisation to the layer even if you set bias to False. But I assume False trumps the weight initialisation, so that you are left with no bias, which is what you want.

ptrblck · October 26, 2018, 11:39am

Sorry for the confusion. In the construction of the conv layer you pass bias as a bool value (code).
If it is set to True (or anything that evaluates to True in that line of code), self.bias will be created as an nn.Parameter; otherwise it is registered as None.

Neda · November 20, 2018, 3:00pm

@ptrblck, in this script, will the Xavier initialization be applied to all layers or only to nn.Conv2d?

ptrblck · November 20, 2018, 3:02pm

It depends on the condition you are using.
In the scripts I’ve posted in this thread, I’ve used if isinstance(m, nn.Conv2d), so it’ll be just used for nn.Conv2d layers.

You can of course add more conditions to it for other layers/parameters etc.
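For example, here is a sketch with several conditions. The choice of layer types and init functions is only an assumption for illustration, not a recommendation for any particular architecture. isinstance accepts a tuple of classes, and model.apply recurses into nested submodules, so a single function can cover a whole network.

```python
import torch.nn as nn

def weights_init(m):
    # One condition can cover several layer types via a tuple
    if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d, nn.Linear)):
        nn.init.xavier_uniform_(m.weight)
        if m.bias is not None:
            nn.init.zeros_(m.bias)
    # A second condition for a different kind of layer
    elif isinstance(m, nn.BatchNorm2d):
        nn.init.ones_(m.weight)
        nn.init.zeros_(m.bias)

# Hypothetical model mixing the layer types handled above
model = nn.Sequential(
    nn.Conv2d(1, 4, kernel_size=3, bias=False),
    nn.BatchNorm2d(4),
    nn.ReLU(),
    nn.ConvTranspose2d(4, 1, kernel_size=3),
)
model.apply(weights_init)
```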

Neda · November 20, 2018, 3:18pm

@ptrblck, thank you. I have a UNet model; do I need to add an if condition for each layer?