Hi!
I am trying to achieve deterministic behavior for the same TorchScript file on the same GPU and the same Docker image across different compute instances. However, I am getting different results depending on the server instance. Note that the Docker image, the GPU, and the underlying OS are all the same. I have turned off all possible sources of nondeterminism:
at::globalContext().setDeterministicAlgorithms(true);
at::globalContext().setDeterministicCuDNN(true);
at::globalContext().setBenchmarkCuDNN(false);
at::manual_seed(0);
torch::jit::setGraphExecutorOptimize(false);
I have even disabled all runtime optimizations. I am getting consistent results on the same compute instance, but different results on different compute instances. Note that the Docker image and CUDA hardware are the same.
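In case it helps anyone reproduce this, bit-level differences between instances can be checked by hashing the raw output values on each machine and comparing the hashes. Below is a minimal sketch, assuming the output has already been copied into a `std::vector<float>`. The helper name `fingerprint` is hypothetical and not part of libtorch; it uses only the standard library (FNV-1a over the raw bytes).

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not a libtorch API): 64-bit FNV-1a hash over the raw
// bytes of a float buffer. Two buffers get the same hash only if their bytes
// match (up to hash collisions), so any single-bit difference in an output
// value shows up as a different fingerprint.
std::uint64_t fingerprint(const std::vector<float>& values) {
    std::uint64_t hash = 14695981039346656037ULL;  // FNV-1a offset basis
    for (float v : values) {
        unsigned char bytes[sizeof(float)];
        std::memcpy(bytes, &v, sizeof(float));     // read bytes without aliasing issues
        for (unsigned char b : bytes) {
            hash ^= b;
            hash *= 1099511628211ULL;              // FNV-1a prime
        }
    }
    return hash;
}
```

The idea would be to move the output tensor to the CPU, copy its contents into a vector, and print `fingerprint(...)` on each instance. Identical hashes suggest bit-identical outputs, while different hashes confirm that at least one value differs.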
Thanks