Unused model parameters affect optimization for Adam

After I move conv_list behind all other modules, it passes the test.
However, I m not sure this is valid in other cases.
For example I have an encoder and decoder and either of them will have variant depth. Then I have a large wrapper module for the encoder and decoder. Even if I move the initialization of conv_list in encoder and decoder itself. It might still have problems because of the wrapper module.

For the seed before instantiating each layer, could you give me an example?

I understand I will get differences but I am just wondering will this effect degrades the performance in general. If this only affects debugging, it might not be a big issue.