Parameter initialisation

How can I initialise all the layers in a network with Xavier initialisation?

import torch
import torch.nn as nn

def weights_init(m):
    # Initialise every Conv2d layer with Xavier (Glorot) uniform weights
    if isinstance(m, nn.Conv2d):
        torch.nn.init.xavier_uniform_(m.weight.data)

Alternatively, you could wrap the call in with torch.no_grad(): and remove the .data access.
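A minimal sketch of that variant (same effect, just avoiding .data):

import torch
import torch.nn as nn

def weights_init(m):
    # Same Xavier init, but under no_grad instead of touching .data
    if isinstance(m, nn.Conv2d):
        with torch.no_grad():
            torch.nn.init.xavier_uniform_(m.weight)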


But this works only for the first layer, doesn't it?

No, the function is applied to all layers when you call:

model.apply(weights_init)
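For example, with a hypothetical toy model (and the weights_init defined above), apply visits every submodule in turn:

import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3),
    nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=3),
)
model.apply(weights_init)  # weights_init is called on each submodule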

Thank you very much, sir.

If, for some hyperparameter search, I am iterating over the network again and, for each iteration, I want to reset all the parameters of the network to the same values they had before any training, how can I achieve this?
Your solution above only initialises the weights once; after that it does not reset the weights to the Xavier initialisation on each successive iteration.

You can call model.apply(weights_init) in your training loop, if you want.

for data, target in train_loader:
    # your training code
    ...
    optimizer.step()
    ...
    model.apply(weights_init)  # re-initialise the weights

Are you sure you want to reset your weights after every iteration?

If you really need the exact same weights, you would need to set the seed before calling weights_init:

torch.manual_seed(YOUR_SEED)
model.apply(weights_init)

for data, target in train_loader:
    ...
    torch.manual_seed(YOUR_SEED)
    model.apply(weights_init)

Consider the case where I am plotting a graph of different learning rates vs. the respective loss; in that case I would need to re-initialise the weights for each new learning rate.
What is the easiest way to do this?
After using your method above, the loss in the first iteration should be the same for every learning rate, but in my code that is not the case. Please help?

The easiest way would be to set a random seed, initialise the model once, and save its state_dict.
Once one experiment run is finished, just load the state_dict and your model will have the same weights again.
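A sketch of that workflow (Net, the seed, and the learning rates here are placeholders for your own model and values):

import copy
import torch

torch.manual_seed(0)                     # any fixed seed
model = Net()                            # your network class
model.apply(weights_init)                # Xavier init as above
init_state = copy.deepcopy(model.state_dict())

for lr in [1e-1, 1e-2, 1e-3]:
    model.load_state_dict(init_state)    # reset to the saved initial weights
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    # ... train and record the loss for this learning rate ...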

If you want to exclude all random effects, you probably also want to set shuffle=False in your DataLoader, since shuffling could also change the results.
Also, you could set torch.backends.cudnn.deterministic = True, but that could slow down your code a lot.
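Putting those together, a minimal reproducibility setup might look like this (dataset and batch_size are assumptions standing in for your own values):

import torch
from torch.utils.data import DataLoader

torch.manual_seed(0)                         # fixed seed
torch.backends.cudnn.deterministic = True    # deterministic cuDNN kernels (can be slow)
train_loader = DataLoader(dataset, batch_size=64, shuffle=False)  # no shuffling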

Hi, is the function weights_init a method of the network class?
And does it work for nn.Module?

Thanks.

I defined the function in my first post.
It works on modules, so you can call model.apply(weights_init).
Inside the function you could use conditions to initialise different module classes with different init functions, as in the sketch below.
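For example, a conditional init function could look like this (the choice of kaiming_normal_ for Linear layers is just an illustration):

import torch.nn as nn

def weights_init(m):
    # Pick a different init scheme per module class
    if isinstance(m, nn.Conv2d):
        nn.init.xavier_uniform_(m.weight)
        if m.bias is not None:
            nn.init.zeros_(m.bias)
    elif isinstance(m, nn.Linear):
        nn.init.kaiming_normal_(m.weight)
        if m.bias is not None:
            nn.init.zeros_(m.bias)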

Thanks for the clarification!

I defined the function before the network class, called it from a fit method inside the class, and it worked.