Deleting an unused part of a module changes the output

In a fairly large model, I have a few linear layers that I was not using, so I commented them out, and surprisingly my Average Precision for the first epoch dropped from 34 to 33. My seeds are fixed, so I know what to expect: if I uncomment those layers again, the first epoch's Average Precision goes back to 34.
Is there any reason for that? I have added a toy example below:

First Example:
import torch
import torch.nn as nn

class test(nn.Module):
    def __init__(self):
        super(test, self).__init__()
        self.lin_unused = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU())
        self.lin_used = nn.Sequential(nn.Linear(1024, 512), nn.ReLU())

    def forward(self, input):
        output = self.lin_used(input)
        return output
Second Example:
class test(nn.Module):
    def __init__(self):
        super(test, self).__init__()
        # self.lin_unused = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU())
        self.lin_used = nn.Sequential(nn.Linear(1024, 512), nn.ReLU())

    def forward(self, input):
        output = self.lin_used(input)
        return output

AP for First Example: 34
AP for Second Example: 33

Hi Asm!

By default, nn.Linear initializes itself with some pseudorandom
numbers. Thus, lin_unused consumes some pseudorandom
numbers and the pseudorandom numbers that lin_used uses
for its initialization will differ, depending on whether lin_unused
is there or not.
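
Here is a minimal sketch (not your actual model) that shows the effect directly: constructing the unused layer draws numbers from the global random-number stream, so whatever is constructed afterward sees different numbers.

import torch
import torch.nn as nn

torch.manual_seed(0)
_ = nn.Linear(1024, 1024)        # the "unused" layer draws from the RNG
after_unused = torch.rand(3)     # next numbers in the stream

torch.manual_seed(0)
after_nothing = torch.rand(3)    # same seed, but nothing constructed first

# Different values: constructing the layer advanced the random stream,
# so whatever is initialized next (e.g. lin_used) gets different numbers.
print(after_unused)
print(after_nothing)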

You can check this in a number of ways:

First, print out some of the weights in lin_used and see that
they differ in the two cases.
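
For example, something along these lines (a sketch based on your toy classes, with the two versions given different names just for the comparison):

import torch
import torch.nn as nn

class WithUnused(nn.Module):
    def __init__(self):
        super().__init__()
        self.lin_unused = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU())
        self.lin_used = nn.Sequential(nn.Linear(1024, 512), nn.ReLU())

class WithoutUnused(nn.Module):
    def __init__(self):
        super().__init__()
        self.lin_used = nn.Sequential(nn.Linear(1024, 512), nn.ReLU())

torch.manual_seed(42)
model_a = WithUnused()
torch.manual_seed(42)
model_b = WithoutUnused()

# Same seed, but the lin_used weights differ between the two versions.
print(model_a.lin_used[0].weight[0, :5])
print(model_b.lin_used[0].weight[0, :5])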

Second, call torch.manual_seed with a fixed, known seed
immediately before constructing lin_used and check that
your results are the same whether or not you comment out
lin_unused.
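
For example (the seed value here is arbitrary, just for illustration):

import torch
import torch.nn as nn

class Test(nn.Module):
    def __init__(self):
        super().__init__()
        # self.lin_unused = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU())
        # Re-seed right before the layer you care about, so its
        # initialization no longer depends on what was built earlier.
        torch.manual_seed(2024)   # any fixed, known seed
        self.lin_used = nn.Sequential(nn.Linear(1024, 512), nn.ReLU())

    def forward(self, input):
        return self.lin_used(input)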

Third, run your whole script a bunch of times (constructing and
running your model from scratch each time), passing a different
seed to torch.manual_seed at the beginning of your script
each time, and verify that the results of your two versions
(with and without lin_unused) are statistically the same,
even though individual runs will differ from one another in
detail, due to statistical variation.
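
Schematically, something like this (a sketch; train_and_evaluate is a placeholder standing in for your actual training and AP computation):

import torch

def train_and_evaluate(use_unused_layer):
    # placeholder: build the model (with or without lin_unused),
    # train for one epoch, and return its Average Precision
    return 0.0

def run_experiment(seed, use_unused_layer):
    torch.manual_seed(seed)                   # different seed per run
    return train_and_evaluate(use_unused_layer)

seeds = [0, 1, 2, 3, 4]
ap_with = [run_experiment(s, True) for s in seeds]
ap_without = [run_experiment(s, False) for s in seeds]

# Compare the means (and spread) of the two versions rather than
# any single run.
print(sum(ap_with) / len(seeds), sum(ap_without) / len(seeds))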

Best.

K. Frank


That's an excellent answer. Thanks a lot for clearing that up!