I have an nn.Module that contains an LSTM whose number of layers is passed in at initialization. I would like to do Xavier initialization of its weights and set the bias of the forget gate to 1, to promote learning of long-term dependencies. My problem is how to iterate over all the parameters in order to initialize them.
Doing something like
for name, param in lstm.named_parameters():
    if 'bias' in name:
        nn.init.constant(param, 0.0)
    elif 'weight' in name:
        nn.init.xavier_normal(param)
does not work, because param is a copy of the parameters in lstm and not a reference to them. This kind of loop can be used, for instance, to print the values of the parameters but not to modify them (as far as I know). Thank you in advance.
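For reference, a sketch of the loop I have in mind (assuming a recent PyTorch, so using the in-place `_` variants of the init functions). In `nn.LSTM` the gates are ordered (input, forget, cell, output), so the forget-gate bias is the second quarter of each bias vector:

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=10, hidden_size=20, num_layers=2)

for name, param in lstm.named_parameters():
    if 'bias' in name:
        # zero the whole bias, then set the forget-gate slice to 1
        nn.init.constant_(param, 0.0)
        n = param.size(0)
        param.data[n // 4: n // 2].fill_(1.0)
    elif 'weight' in name:
        nn.init.xavier_normal_(param)
```

Note that both `bias_ih_l*` and `bias_hh_l*` exist per layer, so the effective forget-gate bias after this loop is their sum, 2.0; halve the fill value if you want exactly 1.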
Did you try? Your snippet works perfectly well for me.
I was surprised at the other thread. The reason you cannot assign to the loop variable is that assignment just makes that name point to something else. You most certainly can modify the element you are looping over; in fact, that is exactly what sometimes surprises people when they loop over a list of lists and append to the inner lists, or some such.
In fact, looping over model.parameters() way above is also how the optimizers get the parameters they are optimizing…
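A tiny check (using an `nn.Linear` as a stand-in module): `param` is the very same `Parameter` object the module holds, not a copy, so in-place operations on it are visible through the module.

```python
import torch
import torch.nn as nn

lin = nn.Linear(4, 4)
for name, param in lin.named_parameters():
    nn.init.constant_(param, 0.5)  # in-place: mutates the module's tensor

print(lin.weight[0, 0].item())  # prints 0.5
```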
Hi! I have a question. Don’t you need to do ‘xavier_uniform_’ instead of ‘xavier_uniform’? Aren’t functions with underscore the ones that modify the weights in-place?
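Yes: throughout `torch.nn.init` (and PyTorch generally), names ending in a trailing underscore modify the tensor in place; the no-underscore names are deprecated aliases from older releases. A minimal illustration:

```python
import torch
import torch.nn as nn

w = torch.empty(3, 5)
nn.init.xavier_uniform_(w)  # fills `w` in place (and also returns it)
```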
I want to load pretrained RNN weights from a dict (named model_pretrained). Inside the loop, param changes, but when I print the model parameters afterwards, I find the RNN parameters have not changed.
for name, param in model.rnn.named_parameters():
    print("0", param)
    param = torch.nn.Parameter(model_pretrained["rnn." + name])
    print("1", param)
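Rebinding `param` inside the loop only repoints the local name; the module still holds its original `Parameter`. A sketch of copying the values in place instead (here `rnn` and `model_pretrained` are stand-ins, with the dict keyed like in the snippet above):

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=4, hidden_size=4)

# stand-in pretrained weights, keyed "rnn.<param name>"
model_pretrained = {"rnn." + k: torch.randn_like(v)
                    for k, v in rnn.named_parameters()}

with torch.no_grad():
    for name, param in rnn.named_parameters():
        # copy into the existing Parameter instead of rebinding the name
        param.copy_(model_pretrained["rnn." + name])
```

In practice the same effect is usually achieved with `load_state_dict` on the submodule.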