Thank you for the tip, it solved my problem! Yes, with torch.set_deterministic(True) I got the following error:
RuntimeError: Deterministic behavior was enabled with either torch.set_deterministic(True) or at::Context::setDeterministic(true), but this operation is not deterministic because it uses CuBLAS and you have CUDA >= 10.2. To enable deterministic behavior in this case, you must set an environment variable before running your PyTorch application: CUBLAS_WORKSPACE_CONFIG=:4096:8 or CUBLAS_WORKSPACE_CONFIG=:16:8. For more information, go to https://docs.nvidia.com/cuda/cublas/index.html#cublasApi_reproducibility
So, as it turned out, setting the environment variable CUBLAS_WORKSPACE_CONFIG=:16:8 or CUBLAS_WORKSPACE_CONFIG=:4096:2 solves the problem and makes the training reproducible if you have CUDA >= 10.2.
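For anyone who prefers to set this from inside the script rather than in the shell, here is a minimal sketch. It assumes the variable only needs to be in place before the first cuBLAS call, so setting it before importing torch should be the safest spot. The helper name enable_determinism and the seed value are my own and purely illustrative; torch.use_deterministic_algorithms is the name newer PyTorch releases use for the same switch.

```python
import os

# Set the cuBLAS workspace config before torch is imported, so it is
# already in place when CUDA/cuBLAS is first initialised.
# setdefault keeps any value that was already exported in the shell.
os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":16:8")


def enable_determinism(seed=0):
    """Hypothetical helper: seed PyTorch and turn on the deterministic switch.

    Returns False if PyTorch is not installed, True otherwise.
    """
    try:
        import torch
    except ImportError:
        return False

    torch.manual_seed(seed)
    if hasattr(torch, "use_deterministic_algorithms"):
        # Newer PyTorch releases
        torch.use_deterministic_algorithms(True)
    elif hasattr(torch, "set_deterministic"):
        # Older releases, as used in this thread
        torch.set_deterministic(True)
    return True


if __name__ == "__main__":
    print("deterministic mode enabled:", enable_determinism(seed=42))
```

The shell equivalent is to prefix the launch command with the variable, e.g. CUBLAS_WORKSPACE_CONFIG=:16:8 python train.py (train.py being a placeholder for your own script).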
My internal MX130 probably used an older CUDA version, which nvidia-smi did not show, and that is probably why it worked in that case but did not work with the eGPU.
PS: some PyTorch layers can't work deterministically at all, for example nn.AdaptiveAvgPool1d(). In my case, though, I just had to remove the torch.set_deterministic(True) setting: there was no error, and that particular layer had no effect on the reproducibility of the results.