Non-deterministic results


(Joo-Kyung Kim) #1

I run the following code before defining the modules. (My model uses Embedding, Dropout, LSTM, and Linear layers.)

import torch

torch.manual_seed(1000)
torch.backends.cudnn.enabled = False
torch.cuda.manual_seed(1000)

However, the final results are still different on each trial (regardless of whether I use CPU or GPU). Is there anything else I should do to get deterministic results given the same input?

Thanks.


(Adam Paszke) #2

Are you using multiple GPUs? If so, you need to use torch.cuda.manual_seed_all.
That should be enough, and it is enough to get deterministic results in all the examples we have. The problem probably lies somewhere in your code.
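
For instance, a minimal sketch of seeding both the CPU and every visible GPU (the seed value is arbitrary):

import torch

torch.manual_seed(1000)           # seeds the CPU RNG
torch.cuda.manual_seed_all(1000)  # seeds the RNG of every visible GPU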


(Adam Paszke) #3

For example, if you’re using numpy or random modules, you need to seed them too.
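
For instance, a minimal sketch of seeding both of those as well (the seed value is arbitrary):

import random
import numpy as np

random.seed(1000)     # Python's built-in random module
np.random.seed(1000)  # numpy's global RNG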


(Joo-Kyung Kim) #4

Thanks! It was because I was using the random module.

BTW, is it deterministic to use cuDNN layers such as SpatialConvolution and SpatialMaxPooling, which can be non-deterministic in Lua Torch?


(Francisco Massa) #5

@supakjk I think that for the moment one cannot choose which algorithm cudnn will use in pytorch, so you can’t assume that it will pick the deterministic algorithm for SpatialConvolution. Also, SpatialMaxPooling is not deterministic in cudnn.


(Joo-Kyung Kim) #6

@fmassa Can’t torch.backends.cudnn.enabled = False guarantee that CUDNN won’t be used, meaning the conv-related operations would be deterministic even when the CUDA implementations are used?


(Francisco Massa) #7

@supakjk yes, disabling CUDNN is an option for enforcing determinism


(Edgar Riba) #8

@apaszke why not just suppress torch.cuda.manual_seed?


(Adam Paszke) #9

@edgarriba not sure what you mean


(Edgar Riba) #10

@apaszke what’s the use case of having both torch.cuda.manual_seed and torch.cuda.manual_seed_all?


(Adam Paszke) #11

manual_seed seeds only the current GPU, manual_seed_all seeds all of them. We’re thinking about having torch.manual_seed seed both CPU and all GPUs.
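
A small sketch of the difference (assuming at least two GPUs are visible; the device index and seed are arbitrary):

import torch

torch.cuda.set_device(0)
torch.cuda.manual_seed(1000)      # seeds only GPU 0, the current device
torch.cuda.manual_seed_all(1000)  # seeds every visible GPU in one call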


(Edgar Riba) #12

@apaszke nice! it will be very helpful


(Samrat Hasan) #13

If I don’t have any random initialization or anything random in my neural network, is it going to be affected by torch.cuda.manual_seed()?

I am trying to understand the role of seeding in PyTorch. For example, if I have a model trained with a specific seed, can I say it will produce the same output for a specific input, while with no seed it’s not guaranteed to produce the same output?

One more thing is bothering me: if I train without setting any seed, why would I get different outputs for the same input, given that there is no randomness associated with my model?


(Houjing Huang) #14

I found that multi-threaded pre-fetching of training samples also introduces randomness. With multiple threads, the samples are put into the queue in a different order on each run, determined by the relative speed of the threads. I had to set the number of pre-fetching threads to 1 to solve the problem.

What’s more, if the pre-fetching thread (there is only one pre-fetching thread in this case) is not the main thread (i.e. it runs in parallel to the main thread) and both threads use random numbers, then make sure these two threads use different random number generators, each generator having its own seed. Otherwise, their relative order of accessing the shared generator may differ between runs. To have separate generators, for example, the main thread can set the seed with numpy.random.seed(seed) and generate values with numpy.random.uniform(), while the pre-fetching thread creates its own generator with prng = numpy.random.RandomState(seed) and generates values with prng.uniform().
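
A minimal sketch of this two-generator setup (the seeds and the worker function are illustrative, not my actual pre-fetching code):

import threading
import numpy as np

np.random.seed(0)  # main thread: seed and use the global numpy RNG

def prefetch_worker(seed):
    # pre-fetching thread: use a private generator with its own seed,
    # so it never competes with the main thread for the global RNG
    prng = np.random.RandomState(seed)
    print("worker sample:", prng.uniform())

t = threading.Thread(target=prefetch_worker, args=(1,))
t.start()
print("main sample:", np.random.uniform())  # main thread draws from the global RNG
t.join()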

BTW, I implemented the multi-threading in my own way using the threading package, not the official one.