Same weight init, learning rate, and data order, but different gradients

I’m doing some experiments and I found that even though I fixed the starting weights (I used pre-initialized weights), the learning rate, and the data order (I set the shuffle option to False), the gradients changed between two runs! I recorded my model's gradients over 5 epochs, ran this twice, and compared the two sets of gradients. They drift further and further apart as the epochs go on.

Is there anything else that could cause the gradients to change?


Are you running your model on the GPU? Some ops used by cuDNN are non-deterministic (especially backward-pass accumulation, which uses atomicAdd with floating-point values).
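To see why accumulation order matters at all, here is a minimal pure-Python sketch of the underlying issue: floating-point addition is not associative, so summing the same values in a different order (as a non-deterministic atomicAdd can) gives slightly different results.

```python
# Floating-point addition is not associative: the same three numbers
# summed in a different grouping give different results.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # one accumulation order
right = a + (b + c)  # another accumulation order

print(left == right)  # False -- the two sums differ in the last bits
```

A GPU backward kernel accumulating millions of such contributions in a thread-scheduling-dependent order produces tiny differences that compound over epochs, which matches the divergence you observed.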

Please find more info about it here

Oh wow, thanks for the quick reply LOL.
I found out that I have to do all three things in the link to make the gradients the same.
Thanks so much for the information!!
Have a good one!
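The thread doesn't quote the link's exact steps, but for reference, a commonly cited PyTorch reproducibility recipe is a sketch like the following (the function name `seed_everything` is my own; the `torch` attributes are from the PyTorch API):

```python
# Sketch of a common PyTorch determinism setup; exact requirements
# may differ by PyTorch/CUDA version -- see the official
# reproducibility notes linked above.
import random

import numpy as np
import torch

def seed_everything(seed: int = 0) -> None:
    random.seed(seed)                 # Python RNG
    np.random.seed(seed)              # NumPy RNG
    torch.manual_seed(seed)           # seeds CPU and all CUDA devices
    # Force cuDNN to pick deterministic kernels
    torch.backends.cudnn.deterministic = True
    # Disable the autotuner, whose algorithm choice can vary run to run
    torch.backends.cudnn.benchmark = False
```

Note that even with these settings, some CUDA ops have no deterministic implementation, so full run-to-run bitwise equality isn't always achievable on the GPU.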