Parameter initialisation

How can I initialise all the layers in a network with Xavier initialisation?

import torch
import torch.nn as nn

def weights_init(m):
    # Initialise every Conv2d layer with Xavier (Glorot) uniform weights
    if isinstance(m, nn.Conv2d):
        torch.nn.init.xavier_uniform_(m.weight.data)

Alternatively, you could wrap the call in with torch.no_grad(): and remove the .data access.
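A minimal sketch of that variant (same effect, just avoiding .data):

import torch
import torch.nn as nn

def weights_init(m):
    # Same Xavier init, but under no_grad instead of touching .data
    if isinstance(m, nn.Conv2d):
        with torch.no_grad():
            torch.nn.init.xavier_uniform_(m.weight)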


But this works only for the first layer, doesn't it?

No, the function is applied to all layers when you call:

model.apply(weights_init)
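For example, with a hypothetical toy model (and the weights_init defined above), apply visits every submodule in turn:

import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3),
    nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=3),
)
model.apply(weights_init)  # weights_init is called on each submodule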

Thank you very much, sir.

If, for some hyperparameter search, I am iterating over the network again and, for each iteration, I want to reset all the parameters of the network to the same values they had before any training, how can I achieve this?
Your solution above only initialises the weights once; after that it does not reset the weights to the Xavier initialisation on each successive iteration.

You can call model.apply(weights_init) in your training loop, if you want.

for data, target in train_loader:
    # your training code
    ...
    optimizer.step()
    ...
    model.apply(weights_init)  # re-initialise the weights

Are you sure you want to reset your weights after every iteration?

If you really need the exact same weights, you would need to set the seed before calling weights_init:

torch.manual_seed(YOUR_SEED)
model.apply(weights_init)

for data, target in train_loader:
    ...
    torch.manual_seed(YOUR_SEED)
    model.apply(weights_init)

Consider the case where I am plotting a graph of different learning rates vs. the respective loss; in that case I would need to re-initialise the weights for each new learning rate.
What is the easiest way to do this?
After using your method above, the loss in the first iteration should be the same for every learning rate, but in my code that is not the case. Please help?

The easiest way would be to set a random seed, initialise the model once, and save its state_dict.
Once one experiment run is finished, just load the state_dict and your model will have the same weights again.
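A sketch of that workflow (Net, the seed, and the learning rates here are placeholders for your own model and values):

import copy
import torch

torch.manual_seed(0)                     # any fixed seed
model = Net()                            # your network class
model.apply(weights_init)                # Xavier init as above
init_state = copy.deepcopy(model.state_dict())

for lr in [1e-1, 1e-2, 1e-3]:
    model.load_state_dict(init_state)    # reset to the saved initial weights
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    # ... train and record the loss for this learning rate ...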

If you want to exclude all random effects, you probably also want to set shuffle=False in your DataLoader, since shuffling could also change the results.
Also, you could set torch.backends.cudnn.deterministic = True, but that could slow down your code a lot.
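Putting those together, a minimal reproducibility setup might look like this (dataset and batch_size are assumptions standing in for your own values):

import torch
from torch.utils.data import DataLoader

torch.manual_seed(0)                         # fixed seed
torch.backends.cudnn.deterministic = True    # deterministic cuDNN kernels (can be slow)
train_loader = DataLoader(dataset, batch_size=64, shuffle=False)  # no shuffling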

Hi, is the function weights_init a method of the network class?
And does it work for nn.Module?

Thanks.

I defined the function in my first post.
It works on modules, so you can call model.apply(weights_init).
Inside the function you could use conditions to initialise different module classes with different init functions, as in the sketch below.
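For example, a conditional init function could look like this (the choice of kaiming_normal_ for Linear layers is just an illustration):

import torch.nn as nn

def weights_init(m):
    # Pick a different init scheme per module class
    if isinstance(m, nn.Conv2d):
        nn.init.xavier_uniform_(m.weight)
        if m.bias is not None:
            nn.init.zeros_(m.bias)
    elif isinstance(m, nn.Linear):
        nn.init.kaiming_normal_(m.weight)
        if m.bias is not None:
            nn.init.zeros_(m.bias)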

Thanks for the clarification!

I defined the function before the network class, called it from a fit method inside the class, and it worked.