How are module weights initialized in PyTorch?

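Setting the seed might not be enough to get exactly the same parameters.
If one model has more layers or different layers than the other, the PRNG is called a different number of times or in a different order, so even the layers both models share can end up with different values.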
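To illustrate the point, here is a minimal sketch. The helper `make_model` and the layer sizes are made up for this example. Both models are built with the same seed, but one of them creates an extra layer first, which consumes random numbers before the shared layer is initialized.

```python
import torch
import torch.nn as nn


def make_model(seed, extra_layer):
    # Hypothetical helper: seed the global PRNG, then build the model.
    torch.manual_seed(seed)
    layers = []
    if extra_layer:
        # This layer draws random numbers before the shared layer does.
        layers.append(nn.Linear(4, 4))
    layers.append(nn.Linear(4, 2))  # the layer both models have in common
    return nn.Sequential(*layers)


small = make_model(0, extra_layer=False)
same_arch = make_model(0, extra_layer=False)
large = make_model(0, extra_layer=True)

# Same seed and same architecture: the shared layer matches.
same_when_identical = torch.equal(small[-1].weight, same_arch[-1].weight)
# Same seed but an extra layer in front: the shared layer should differ.
same_when_different = torch.equal(small[-1].weight, large[-1].weight)

print(same_when_identical, same_when_different)
```

So the seed only gives you reproducibility for the exact same sequence of random calls, not for "the same layer" across different architectures.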

I would suggest initializing one model and copying its parameters to the other model. This makes sure that at least all common layers have the same parameters.
Here is a small example.
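Something along these lines should work. This is only a sketch: `ModelA` and `ModelB` are made-up models, and it assumes the common layers have the same names and shapes in both. Entries that don't match are filtered out, and `strict=False` lets `load_state_dict` skip the layers that exist only in the second model.

```python
import torch
import torch.nn as nn


class ModelA(nn.Module):
    # Hypothetical smaller model.
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 8)
        self.fc2 = nn.Linear(8, 2)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))


class ModelB(nn.Module):
    # Hypothetical larger model with one additional layer.
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 8)
        self.extra = nn.Linear(8, 8)
        self.fc2 = nn.Linear(8, 2)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.extra(x))
        return self.fc2(x)


model_a = ModelA()
model_b = ModelB()

state_a = model_a.state_dict()
state_b = model_b.state_dict()

# Keep only entries that exist in both models with the same shape.
common = {
    name: tensor
    for name, tensor in state_a.items()
    if name in state_b and tensor.shape == state_b[name].shape
}

# strict=False ignores the keys of model_b that are not in `common`.
result = model_b.load_state_dict(common, strict=False)

print(sorted(common.keys()))
print(result.missing_keys)  # layers of model_b that kept their own init
```

`load_state_dict` copies the values, so the two models don't share storage afterwards and can be trained independently. The layers listed in `missing_keys` keep whatever initialization they got when `model_b` was created.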
