Use of torch.manual_seed still results in different initial weights each time

I am new to PyTorch. As part of my project, I have some code that builds an OrderedDict incrementally at runtime, depending on which layers are needed in the model. The code is structured roughly like this:

growingordereddict = OrderedDict()
for layer in total_layer_list:
    # Depending on the layer, create the corresponding nn module and add it to the ordered dict.
    # In PyTorch you need to know in_channels for a module with parameters, such as Conv or Linear.
    # To obtain the number of in channels, the hack I use is to construct a temp model from the
    # OrderedDict built so far, run the fixed-size input through it, and read off the output size.
    # That output size becomes the in_channels of the next layer.
    # For example:
    # tempmodel = nn.Sequential(growingordereddict)
    # out = tempmodel(torch.randn(1, 3, 5, 5))  # model takes a fixed-size input
    # size = out.size()
    # Use this size as in_channels when constructing the next nn module.

After this loop is done, the final OrderedDict will have all the layers.

At the end, outside the loop, I do this:

torch.manual_seed(1)
model = nn.Sequential(growingordereddict)

However, I notice that the initial weights are different on every run; they don’t seem to have a fixed value corresponding to seed 1. Why does my code produce different initial weights for the layers on each run, even though I use a fixed seed of 1?

I noticed that if I have a sample script with the OrderedDict predefined and manual seed 1, the weights are the same on each run. But in the case above, where a for loop instantiates a model many times to get the output size for constructing the next layer, the manual seed set outside the loop before instantiating the final model does not seem to fix the model weights. What is the connection, and why is this happening?
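For reference, this is the kind of predefined-OrderedDict sample I mean, where the weights do come out identical on every run (a minimal sketch; the layer names and sizes here are placeholders, not my real model):

from collections import OrderedDict
import torch
import torch.nn as nn

torch.manual_seed(1)
model = nn.Sequential(OrderedDict([
    ("conv", nn.Conv2d(3, 8, 3)),      # created after the seed, so its init is reproducible
    ("flatten", nn.Flatten()),
    ("fc", nn.Linear(8 * 3 * 3, 10)),  # 8*3*3 matches a 1x3x5x5 input through the conv
]))
print(model.fc.weight)  # identical on every run of this script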

It would be helpful for debugging if you could share the code.

Thanks Arul. Unfortunately the code is a bit proprietary, with many modules, so it would be difficult to share. What are you looking for in particular? Currently, when I print model.fc.weight after initializing the model (say the dense layer is named fc), the weights are different on each run of the program.
This does not happen if I use a predefined OrderedDict with that seed. I only see it when incrementally adding to the ordered dictionary and instantiating the model after the for loop.

As a pseudo sample, imagine a model with just Conv2d, flatten and Linear.

The for loop would look like this:

ordereddict = OrderedDict()
prev_output_size = 3
for layer in layer_list:
    # constructmodule returns the nn module for this layer, sized from the previous output
    ordereddict[layer] = constructmodule(layer, prev_output_size)
    # instantiate a temp model here to learn the output size after this layer
    tempmodel = nn.Sequential(ordereddict)
    prev_output_size = tempmodel(torch.rand(1, 3, 5, 5)).size()

constructmodule will return an nn.Module depending on the layer.
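To make the pseudo sample concrete, constructmodule for that Conv2d/flatten/Linear case could look roughly like this (purely a hypothetical sketch; the layer names and the output sizes 8 and 10 are made up here, not from my real code):

import math
import torch.nn as nn

def constructmodule(layer, prev_output_size):
    # prev_output_size is either the initial channel count (3) or the torch.Size
    # returned by the temp model in the previous iteration
    if isinstance(prev_output_size, int):
        channels = flat = prev_output_size
    else:
        channels = prev_output_size[1]          # (N, C, H, W) -> C
        flat = math.prod(prev_output_size[1:])  # everything except the batch dim
    if layer == "conv":
        return nn.Conv2d(channels, 8, kernel_size=3)
    if layer == "flatten":
        return nn.Flatten()
    if layer == "fc":
        return nn.Linear(flat, 10)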

At the end of the loop I do this:

torch.manual_seed(1)
finalmodel = nn.Sequential(ordereddict)
print(finalmodel.fc.weight)  # this is different each time

In my understanding it should not be different, assuming that you are recreating growingordereddict after setting the seed.

You could first try to reproduce the behavior without any proprietary modules, and share a short snippet once you can reproduce it. Also make sure that the random seed is not consumed by anything else after you set it.
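By “consumed” I mean something like this (a small sketch): anything that draws from the global generator between the seed call and the module creation changes the weights you end up with.

import torch
import torch.nn as nn

torch.manual_seed(1)
a = nn.Linear(4, 2)        # weights drawn right after the seed

torch.manual_seed(1)
_ = torch.rand(10)         # something else consumes the generator first
b = nn.Linear(4, 2)        # same seed, but different weights than a
print(torch.allclose(a.weight, b.weight))  # False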

Thanks Arul, please check my updated response above.

I see. When you build ordereddict, the weights of those modules are already initialized. nn.Sequential is just a container that holds the modules; it does nothing to initialize the weights. So the final torch.manual_seed(1) has no effect on the weights in your code.
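You can check this directly (a small sketch): the weight tensor inside the container is the very same tensor that existed before wrapping, so the later seed cannot change it.

from collections import OrderedDict
import torch
import torch.nn as nn

layers = OrderedDict([("fc", nn.Linear(4, 2))])  # weights are initialized here
before = layers["fc"].weight.clone()

torch.manual_seed(1)                 # seed set only after the module exists
model = nn.Sequential(layers)        # just wraps the existing module

print(torch.equal(model.fc.weight, before))    # True: wrapping changed nothing
print(model.fc.weight is layers["fc"].weight)  # True: same tensor object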

Ah, thanks! So what would be my way around this? Do I need to create a new ordered dictionary that is a deep copy of ordereddict to instantiate the final model?

I would note down the sizes during the for loop and, after the loop, set the seed and create a clean model.
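Roughly like this, reusing your names (a sketch; it assumes constructmodule only needs the recorded size to rebuild the same layer):

from collections import OrderedDict
import torch
import torch.nn as nn

# first pass: only record the sizes; the modules built here are thrown away
sizes = [3]
scratch = OrderedDict()
for layer in layer_list:
    scratch[layer] = constructmodule(layer, sizes[-1])
    sizes.append(nn.Sequential(scratch)(torch.rand(1, 3, 5, 5)).size())

# second pass: set the seed, then build the real model from scratch
torch.manual_seed(1)
finalmodel = nn.Sequential(OrderedDict(
    (layer, constructmodule(layer, size)) for layer, size in zip(layer_list, sizes)
))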

Or you can reset the parameters at the end.
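For example, something along these lines (a sketch; it assumes every module with weights implements reset_parameters, which built-in layers such as Conv2d and Linear do):

torch.manual_seed(1)
for module in finalmodel.modules():
    if hasattr(module, "reset_parameters"):
        module.reset_parameters()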

Thanks Arul, that’s nice to know! I did not realize that the weights are already initialized when the ordered dictionary is constructed. Does that also mean that if I move the torch.manual_seed(1) line to the first line of the for loop, I should be good? Because in the last iteration of the loop, the ordered dictionary contains all the layers needed for the model.

Secondly, the weight reset is also useful! But in that case, after resetting, how do I set the weights to what a particular seed would have produced? Say, after the reset, set the weights to what they would have been for seed 1?

if I move the line torch.manual_seed(1) to the first line of the for loop I should be good?

Not really. Again, a module’s weights are initialized as and when it is created, so this is not a foolproof strategy.

how do I set it to what a particular seed would have done?

Resetting the weights after setting torch.manual_seed() is equivalent to creating a new layer:

import torch
import torch.nn as nn

torch.manual_seed(2)
x = nn.Conv2d(3, 2, 3)   # initialized with the generator state right after the seed
y = nn.Conv2d(3, 2, 3)   # initialized with a later generator state, so it differs from x

torch.manual_seed(2)
y.reset_parameters()     # re-initializes y exactly as x was initialized
torch.allclose(x.weight, y.weight)  # True


Brilliant, thanks Arul!