Hi,
I am very new to Pytorch and this is my first post here, so I apologise if the question is very straightforward.
My problem is that I have defined class net1 and initialised it randomly with a manual seed.
```python
random.seed(opt.manualSeed)
torch.manual_seed(opt.manualSeed)
if torch.cuda.is_available():
    torch.cuda.manual_seed_all(opt.manualSeed)

class net1(nn.Module):
    def __init__(self):
        super(net1, self).__init__()
        self.main_body = nn.Sequential(
            nn.Conv2d(in_channels=3, out_channels=96, kernel_size=11, stride=4, padding=0),
            # rest of the network...
        )

    def forward(self, x):
        output = self.main_body(x)
        return output

# custom weights initialization called on net1
def weights_init(m):
    classname = m.__class__.__name__
    if classname.find('Conv') != -1:
        m.weight.data.normal_(0.0, 0.02)
    elif classname.find('BatchNorm') != -1:
        m.weight.data.normal_(1.0, 0.02)
        m.bias.data.fill_(0)

net1_ = net1()
net1_.apply(weights_init)
```
However, when I add another class net2 to the code and instantiate it, I get different outputs from my graph, even though net2 is not used anywhere else and is not connected to my main graph (which is built on net1_).
Is it a reasonable outcome?
I don't think this is related to net2. Some operations in convolution layers are non-deterministic, and BatchNorm behaves differently depending on the mini-batch.
If you inserted the code of net2 above net1, then what you see is also expected: the weights of net2 get initialized first, which changes the state of the random number generator before net1 is initialized.
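To illustrate that point, here is a minimal sketch (not the original code; the layer sizes and the helper name `first_weight_after_seed` are made up) showing that merely constructing an unused module consumes values from the global RNG, so anything initialized afterwards comes out different:

```python
import torch
import torch.nn as nn

def first_weight_after_seed(create_extra_module):
    torch.manual_seed(0)
    if create_extra_module:
        _ = nn.Linear(4, 4)  # never used, but its init still draws from the RNG
    layer = nn.Conv2d(3, 8, kernel_size=3)
    return layer.weight.detach().clone()

w_without = first_weight_after_seed(False)
w_with = first_weight_after_seed(True)
print(torch.equal(w_without, w_with))  # False: the unused Linear shifted the RNG state
```

Re-running either branch with the same flag reproduces the same weights, so the seed is working; it is the extra draw in between that changes the result.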
Thanks for your answer.
Though I do get very consistent outputs every time I run the code when class _net2 is not defined.
I think I see it now: defining and instantiating a new module anywhere probably changes the state of the random generator.
I have some trouble understanding the meaning of torch.manual_seed(opt.manualSeed). You don't call this variable later on, so how does it help initialize the net? Which parameters does it initialize? Thanks.
To my understanding, setting a manual seed makes all the random initializations in the code (e.g. the weight initializations embedded in the network definitions, or any other place where you call a random generator) deterministic. In other words, given a fixed manual seed, they generate identical values every time you run the program.
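A quick sketch to check this (the shapes are arbitrary): consecutive draws within one run are still different from each other, but re-seeding reproduces the whole sequence.

```python
import torch

torch.manual_seed(42)
w1 = torch.empty(5).normal_()
w2 = torch.empty(5).normal_()

# Consecutive draws after one seed are different from each other...
print(torch.equal(w1, w2))  # False

# ...but re-seeding replays the exact same sequence of draws.
torch.manual_seed(42)
print(torch.equal(w1, torch.empty(5).normal_()))  # True
```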
If I manually set the seed and then initialize three weight vectors of the same shape, will all three vectors be initialized to the same values?