I found in the PyTorch source code that some operations are non-deterministic because they use atomicAdd, like this one: pytorch/NLLLoss2d.cu at 3b78c5682b483086f66d875749f94b7551072a05 · pytorch/pytorch (github.com)
Does anyone know why atomicAdd can cause non-deterministic behavior? Thanks!
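The order in which the summands are added is not known, and that causes the non-deterministic behavior. Floating-point addition is not associative, so the rounded result depends on the order of the additions. A single addition is deterministic: y = x1 + x2. With more summands, y = x1 + x2 + x3 + x4, the threads can execute their atomicAdd calls in a different order on each run, so the result can differ slightly between runs.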
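To make the effect of summation order concrete, here is a minimal sketch in plain Python. It is not the CUDA kernel and does not call atomicAdd; it only mimics threads whose additions arrive in two different orders. The helper name `sum_in_order` and the chosen values are assumptions made up for this illustration.

```python
def sum_in_order(values, order):
    """Accumulate values one at a time, visiting indices in the given order."""
    total = 0.0
    for i in order:
        total += values[i]
    return total


# Summands of very different magnitude make the rounding difference obvious.
values = [1e16, 1.0, -1e16, 1.0]

# Two possible arrival orders for the same four summands.
forward = sum_in_order(values, [0, 1, 2, 3])
reordered = sum_in_order(values, [0, 2, 1, 3])

print(forward)    # 1.0: the first 1.0 is lost when it is added to 1e16
print(reordered)  # 2.0: the large values cancel first, so both 1.0 survive
```

Both calls add exactly the same four numbers, yet the results differ. On the GPU the arrival order of the atomicAdd calls depends on thread scheduling, so which result you get can change from run to run, although the differences are usually much smaller than in this exaggerated example.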