Training Reproducibility Problem

Note that there are known sources of randomness even in this case. The documentation has a section on it.

Best regards

Thomas

On that page, it says:

Completely reproducible results are not guaranteed across PyTorch releases, individual commits or different platforms. Furthermore, results need not be reproducible between CPU and GPU executions, even when using identical seeds.

This is to be expected, though, because cuDNN and PyTorch’s CPU backend use different algorithms for certain operations (a prominent example being convolutions, for which dozens of approximations exist). Differences between versions are also to be expected and can be attributed to bug fixes, algorithm changes, and so on.
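For illustration, here is a small sketch (assuming a CUDA-capable machine and cuDNN) that runs the same convolution, with identical weights and input, on CPU and GPU. The two outputs typically agree only up to floating-point tolerance, not bitwise:

```python
import torch

torch.manual_seed(0)

# Same convolution module, same weights, evaluated on CPU and on GPU.
conv = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1)
x = torch.randn(1, 3, 64, 64)

out_cpu = conv(x)

if torch.cuda.is_available():
    out_gpu = conv.cuda()(x.cuda()).cpu()
    # Usually close within default tolerance, but rarely bitwise identical,
    # because cuDNN and the CPU backend pick different convolution algorithms.
    print(torch.allclose(out_cpu, out_gpu))
    print((out_cpu - out_gpu).abs().max())
```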

In practice, I find that if I run the same code multiple times with the same cuDNN and PyTorch versions (even on different machines), I always get consistent results, assuming manual seeds are set for weight initialization and shuffling and cuDNN is set to deterministic mode, as in the sketch below.
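For reference, a minimal sketch of the kind of setup I mean (the helper name set_seed is my own; the flags are the standard torch.backends.cudnn switches):

```python
import random

import numpy as np
import torch

def set_seed(seed: int = 42) -> None:
    # Seed Python, NumPy, and PyTorch (CPU and all GPUs), which covers
    # weight initialization and DataLoader shuffling driven by torch's RNG.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

    # Force cuDNN to use deterministic algorithms and disable the benchmark
    # autotuner, which can otherwise select different kernels from run to run.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

set_seed(42)
```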