Even though I defined self.linear_1 and self.linear_2, they are never called in forward. However, I get different results if I remove them. That seems weird to me, since they are not part of the model's computation. Can anyone explain it?
That’s expected, since the order of calls into the pseudo-random number generator (PRNG) differs: these layers are still randomly initialized during construction even if they are never used in the forward method.
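Here is a minimal sketch of the effect. The model names and layer shapes are made up for illustration: `ModelA` defines an extra layer that is never called in `forward`, `ModelB` does not, and both are created from the same seed.

```python
import torch
import torch.nn as nn

class ModelA(nn.Module):
    def __init__(self):
        super().__init__()
        self.unused = nn.Linear(10, 10)  # initialized, but never called in forward
        self.out = nn.Linear(10, 10)

    def forward(self, x):
        return self.out(x)

class ModelB(nn.Module):
    def __init__(self):
        super().__init__()
        self.out = nn.Linear(10, 10)

    def forward(self, x):
        return self.out(x)

torch.manual_seed(0)
a = ModelA()
torch.manual_seed(0)
b = ModelB()

# Initializing `unused` consumed draws from the PRNG, so `out`
# receives different random values in the two models.
print(torch.equal(a.out.weight, b.out.weight))  # False
```

So the unused layers change the results indirectly: they shift the PRNG state before the layers that *are* used get initialized.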
Seeding will guarantee that the same sequence of PRNG calls will generate the same random number sequence.
Yes, you are right: there is no proper way of “fixing” the expected noise by changing random seeds.
You should expect to see approximately the same final model accuracy when simply using another initial seed. If the standard deviation of the final accuracy across seeds is too large, you would have to check your training routine and try to stabilize the training.
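A simple way to check this is to repeat the run over several seeds and look at the spread of the final accuracy. The sketch below assumes a hypothetical `train_and_eval(seed)` function standing in for your actual training routine; here it just returns a dummy accuracy so the snippet runs on its own.

```python
import statistics
import torch

def train_and_eval(seed: int) -> float:
    # Placeholder for your real training + evaluation loop;
    # it only needs to return the final accuracy for a given seed.
    torch.manual_seed(seed)
    return 0.90 + 0.01 * torch.rand(1).item()  # dummy accuracy in [0.90, 0.91)

accs = [train_and_eval(s) for s in range(5)]
print(f"mean={statistics.mean(accs):.4f} stdev={statistics.stdev(accs):.4f}")
```

If the reported standard deviation is large relative to the accuracy differences you care about, the training itself is unstable and seed changes are not the real problem.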