The initialization scheme is layer-dependent. How does PyTorch seed its RNGs by default?
If I train a model N times to measure its average performance, do I need special code to ensure the initializations differ between runs? Or can I assume that each training run starts from a different initialization?
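One way to sidestep the uncertainty is to seed each run explicitly, which makes the N initializations both distinct and reproducible. A minimal sketch (the `init_model` helper and `Linear` layer are just placeholders for a real model):

```python
import torch

def init_model(seed):
    # Explicitly seed the CPU (and any CUDA) RNGs before building the model,
    # so the initial weights are a deterministic function of `seed`.
    torch.manual_seed(seed)
    return torch.nn.Linear(4, 4)  # placeholder model; weights drawn after seeding

# Different seeds give different initial weights...
w0 = init_model(0).weight.detach().clone()
w1 = init_model(1).weight.detach().clone()
assert not torch.equal(w0, w1)

# ...while reusing a seed reproduces the same initialization exactly.
w0_again = init_model(0).weight.detach().clone()
assert torch.equal(w0, w0_again)
```

Running the training loop once per seed (e.g. seeds `0..N-1`) then gives N runs whose initializations are guaranteed to differ, without relying on whatever default seeding PyTorch does.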