Different training results when just defining extra layers in code!

Hi, I'm facing a strange issue. I am training a simple fully connected network on the MNIST dataset, and I have used `torch.manual_seed` for reproducibility.
In general, I get accuracies like acc = [0.3722, 0.3514, 0.4277] in the first three epochs.
When I add the definition of an extra network (which I do not use in training and do not pass to the optimizer), I get different results!!
I have checked my code multiple times, and just commenting out the line that defines the extra network brings back the original results, even though I use random shuffling in the DataLoader.

Initializing an additional layer will initialize all of its parameters by calling into the pseudorandom number generator and will thus change all following random calls. This is expected behavior, and we've had some topics about the same behavior here. :wink:
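A minimal sketch of what happens (the layer sizes and seed value here are arbitrary, just for illustration):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
a = torch.randn(3)  # first random draw after seeding

torch.manual_seed(0)
extra = nn.Linear(10, 10)  # unused layer: its weight/bias init draws from the same global RNG
b = torch.randn(3)  # identical call as above, but the RNG state has already advanced

print(torch.equal(a, b))  # False: defining the layer changed all subsequent random values
```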

What should I do about this? I want to do some other operations without affecting the base training.
Besides, could you please link me to the other topics? :pray:

Interesting topics might be:

You would have to avoid calling into random methods (which is not possible when randomly initializing new layers), or (re-)seed the code after initializing the model so that the training loop itself is seeded directly.
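A minimal sketch of the re-seeding approach, assuming a toy `nn.Linear` stand-in for the real model and arbitrary seed values:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(784, 10)  # the model actually being trained (toy stand-in)

# Any extra, unused modules can be created here; they consume global RNG state...
extra = nn.Linear(784, 10)

# ...so re-seed afterwards: everything below (DataLoader shuffling, dropout, ...)
# now sees the same random sequence whether or not `extra` was defined.
torch.manual_seed(1)
```

With this layout you can add or remove extra network definitions above the second `torch.manual_seed` call without changing the random values seen by the training loop.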
