The documentation states:
[I]n order to make computations deterministic on your specific problem on one specific platform and PyTorch release, there are a couple of steps to take. […] A number of operations have backwards that use atomicAdd, in particular […] many forms of pooling, padding, and sampling. There currently is no simple way of avoiding non-determinism in these functions.
Does this mean that if I follow the guidelines, I will get deterministic results between individual runs, given that the software and hardware do not change? Or does “no simple way” mean “no currently implemented way”?
I’m asking because I do the following:
import random

import numpy as np
import torch


def make_reproducible(seed=0):
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
make_reproducible()
cnn_training_loop()
and the loss deviates between runs after a few iterations. The deviation initially shows up only in the 4th or 5th significant digit of the loss, but builds up over time. Since it is so small at first, I suspect it stems from some non-deterministic behavior rather than a bug in my implementation.
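To convince myself that a reordered floating-point sum alone can produce deviations of this size, I ran this plain-Python sketch (the values and their scale are arbitrary, chosen only for illustration):

```python
import random

# Floating-point addition is not associative, so summing the same
# numbers in a different order can change the low-order bits of the
# result. CUDA's atomicAdd gives no ordering guarantee across threads,
# which is the usual source of exactly this kind of tiny deviation.
values = [0.1 * i for i in range(1000)]

total_forward = sum(values)      # one summation order

random.seed(0)                   # seeded only so this demo is repeatable
shuffled = values[:]
random.shuffle(shuffled)
total_shuffled = sum(shuffled)   # the same numbers, another order

# The two totals agree to many significant digits but are generally
# not bit-identical.
print(abs(total_shuffled - total_forward))
```

In training, a difference this small feeds into the next gradient step, so it compounds across iterations, which matches the pattern I see: agreement to 4-5 significant digits at first, diverging later.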