Weight initialization

@lakehanne IIRC, Python caps the recursion depth (the default limit is 1000 calls); either you forgot a termination condition in your code or you need another approach (an iterator/generator?).

@Atcold Hi, can we simply use model2’s weights to initialize model1’s layer via model1.conv1.parameters = model2.conv1.parameters? Thank you.

@Xiaoyu_Liu You can check this, for example. Furthermore, if you want to copy by module names, you can use named_parameters() or state_dict(), and make sure you do a deep copy.
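For example, a minimal sketch assuming model1 and model2 both have a conv1 layer with matching shapes:

# copy one layer's parameters; copy_ writes the values into the
# existing tensors, so it is effectively a deep copy
model1.conv1.weight.data.copy_(model2.conv1.weight.data)
model1.conv1.bias.data.copy_(model2.conv1.bias.data)

# or copy every parameter whose name matches between the two models
model1.load_state_dict(model2.state_dict(), strict=False)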

Where is the implementation of Xavier initialization?

Is this the Xavier one:

m.weight.data.normal_(0, math.sqrt(2. / n))

?

Here you go:
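For example, something along these lines (a minimal sketch, assuming import torch.nn as nn and an existing model net):

def weights_init(m):
    # apply Xavier init to every Linear submodule; add more types as needed
    if isinstance(m, nn.Linear):
        nn.init.xavier_normal_(m.weight)

net.apply(weights_init)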

How did you know that apply existed? It doesn’t seem to be documented anywhere I’m looking.

I guess one can find the official function in the docs:

http://pytorch.org/docs/master/nn.html?highlight=xavier#torch.nn.init.xavier_normal

You’re right, apply is not documented. I’ll open a PR.
Updated: https://github.com/pytorch/pytorch/pull/2327


module.weight.data.copy_(everything_you_want)

Conv2d weights have shape [out_channels, in_channels / groups, kernel_height, kernel_width].
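For example (a sketch with arbitrary sizes, assuming import torch and import torch.nn as nn):

conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3)
custom = torch.randn(16, 3, 3, 3)  # [out_channels, in_channels / groups, kH, kW]
conv.weight.data.copy_(custom)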

It seems that only the conv layers are initialized in the ResNet model, not the linear layers. What about linear layers? Is there a default initialization if I don’t define one in my model?

Thank you!

Looks like a fan-in-scaled uniform distribution is the default initialization for nn.Linear (similar in spirit to, but not exactly, Glorot uniform). Check out reset_parameters().
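A quick empirical check (a sketch; in current PyTorch the bound works out to 1/sqrt(fan_in)):

import math
import torch.nn as nn

lin = nn.Linear(100, 10)
# all weights should lie within (-1/sqrt(100), 1/sqrt(100)) = (-0.1, 0.1)
print(lin.weight.min().item(), lin.weight.max().item())
print(1 / math.sqrt(lin.in_features))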

Code god! May I ask a question: how can I initialize an LSTM or a CNN? I know that an LSTMCell can be initialized with nn.init.xavier_uniform(LSTMCell.bias_ih) or nn.init.xavier_uniform(LSTMCell.bias_hh), but that doesn’t work for nn.LSTM.

According to dir(net), there are two weight matrices:

weight_hh_l0
weight_ih_l0

so you could do

import torch.nn as nn

def initialize_weights(model):
    # Linear layers have a single weight matrix
    if isinstance(model, nn.Linear):
        nn.init.xavier_normal_(model.weight)
    # recurrent layers expose input-hidden and hidden-hidden matrices per layer
    elif isinstance(model, (nn.LSTM, nn.RNN, nn.GRU)):
        nn.init.xavier_normal_(model.weight_hh_l0)
        nn.init.xavier_normal_(model.weight_ih_l0)

This is also documented here. So if you have more than one recurrent layer, you’ll have to initialize 2 * num_layers weight matrices. I’m not sure how to iterate over them elegantly; using list(parameters()) and only changing 2-dimensional tensors (i.e. not biases) could be a solution, as in the sketch below.
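Something like this (a sketch; it skips biases by their dimensionality):

def init_rnn_weights(rnn):
    # weight matrices of a (possibly multi-layer) RNN are 2-D;
    # biases are 1-D, so they are left untouched
    for param in rnn.parameters():
        if param.dim() >= 2:
            nn.init.xavier_normal_(param)

lstm = nn.LSTM(input_size=10, hidden_size=20, num_layers=3)
init_rnn_weights(lstm)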


What about multiple layers? I can initialize one layer, but I can’t initialize multiple layers using a loop. I’ve tried a few approaches, but nothing works…

You can call the weight_init function on your model using model.apply(weight_init).
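For instance, with the initialize_weights function from above and an arbitrary model:

model = nn.Sequential(nn.Linear(10, 20), nn.ReLU(), nn.Linear(20, 2))
model.apply(initialize_weights)  # visits every submodule recursively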


Does batch normalization help with different weight initializations? Ensure you include at least one of Xavier’s and Kaiming’s initializations. Can anyone please help me with this?

Hi,
I just read these answers and I have a question:
With

a = nn.Linear(2, 3)

I get an fc layer with 2-d input and 3-d output, and the parameters in the layer come from N(0, 1) without any initialization on my part, i.e. the fc layer is initialized automatically.

Can I specify an initialization method like Xavier when I create the layer, or do I have to get the parameters iteratively and change them with a function, or use the apply method?

The Linear layer will return a 2-dimensional output with 3 output features.
It’s initialized automatically using the uniform_ distribution; you can see the initialization here.

You could override the layer and implement your own method, or just use apply to initialize it with Xavier or another scheme.
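A minimal sketch of the override approach (XavierLinear is a made-up name):

import torch.nn as nn

class XavierLinear(nn.Linear):
    # nn.Linear.__init__ calls reset_parameters, so overriding it
    # replaces the default scheme with Xavier
    def reset_parameters(self):
        nn.init.xavier_uniform_(self.weight)
        if self.bias is not None:
            nn.init.zeros_(self.bias)

a = XavierLinear(2, 3)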

There’s no such thing as “should”; either way is fine.