Non-deterministic results on GPU

Hi, I get non-deterministic results when I run my code several times on a GPU.
I printed the loss across runs: there are minor differences at the beginning, and the values become totally different by the end.
I am wondering whether there is an implementation error in my code. I use torchtext for processing the data, and I run on a single GPU. I set the random seed for torch and torch.cuda, and also for numpy and the random module.
I also call model.train() during training and model.eval() during testing.
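
For reference, this is roughly the seeding setup I mean (a minimal sketch; the seed value itself is arbitrary):

```python
import random
import numpy as np
import torch

SEED = 1234  # arbitrary; any fixed value works

random.seed(SEED)
np.random.seed(SEED)
torch.manual_seed(SEED)
torch.cuda.manual_seed(SEED)  # I run on a single GPU
```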
I noticed that some people attribute this to randomness on the GPU, but my results differ by 1% to 2% across runs, which seems like a big gap for floating-point noise alone.
The following is my model code, which is simply a CNN for text classification.


and this is the train.py:
https://github.com/Impavidity/kim_cnn/blob/master/train.py

Can someone help me out here? Thanks.


cuDNN's convolution kernels are non-deterministic by default; the deterministic implementations are slower.
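
If you can afford the slowdown, you can ask PyTorch to select the deterministic cuDNN algorithms via the torch.backends.cudnn flags (a minimal sketch):

```python
import torch

# Select deterministic convolution algorithms in cuDNN.
torch.backends.cudnn.deterministic = True
# Disable the autotuner; benchmarking can pick different
# (and non-deterministic) kernels from run to run.
torch.backends.cudnn.benchmark = False
```

With these set (plus fixed seeds), repeated runs on the same GPU should produce matching losses, assuming no other non-deterministic ops are involved.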

See the bug reports filed against Torch and TensorFlow; I saw the same question on the NVIDIA forums too.
