Non-deterministic results on multi-layer LSTM with dropout

I get non-deterministic results when I run an RNN model with multiple layers and dropout on a GPU. The LSTM layer is defined by the following line:

self.enc_rnn = nn.LSTM(input_dim, self.rnn_dim, self.num_layers, bidirectional=True, dropout=self.dropout_p)

I have set up the seeds and the device with the following lines before training:

import numpy as np
import torch

torch.cuda.set_device(0)
np.random.seed(1234)
torch.manual_seed(1234)
torch.cuda.manual_seed(1234)
torch.cuda.manual_seed_all(1234)

I get consistent results when self.num_layers == 1 and dropout is non-zero, or when self.num_layers > 1 and dropout == 0. However, the model produces non-deterministic results when self.num_layers > 1 and dropout is non-zero. Am I still missing something needed to get deterministic results? A minimal sketch of the setup is below.
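
For reference, here is a minimal, self-contained sketch that should reproduce the behavior; the dimensions and dropout value are placeholders, not my actual hyperparameters, and a CUDA-capable GPU is assumed:

import torch
import torch.nn as nn

def run_once(seed=1234):
    # Re-seed so both runs start from identical weights, inputs, and RNG state.
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    lstm = nn.LSTM(32, 64, num_layers=2, bidirectional=True, dropout=0.5).cuda()
    x = torch.randn(10, 4, 32, device='cuda')  # (seq_len, batch, input_dim)
    out, _ = lstm(x)
    return out

a = run_once()
b = run_once()
print(torch.equal(a, b))  # expected True; reportedly False with num_layers > 1 and dropout > 0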

Thanks


It’s quite possible the cuDNN LSTM kernels are non-deterministic.
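
If that's the cause, one diagnostic is to force deterministic cuDNN behavior, or to bypass cuDNN entirely and fall back to the native (slower) implementation. A sketch using the standard torch.backends.cudnn flags:

torch.backends.cudnn.deterministic = True  # prefer deterministic kernels where available
torch.backends.cudnn.benchmark = False     # don't auto-tune algorithms between runs

# If the runs still diverge, disable cuDNN as a test; the fallback LSTM
# implementation is slower but should be deterministic once seeded:
torch.backends.cudnn.enabled = False

Note that on some cuDNN versions the RNN dropout path reportedly stays non-deterministic even with the first two flags set, so the enabled = False test is the more conclusive check.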