Why does "numpy.random.rand" produce the same values on different cores?

Hi guys, I have a question about using the multiprocessing module to print random numbers.

When I used “numpy.random.rand” to produce random numbers, I found that some of the values produced on different cores were the same. “random.uniform”, however, works fine.

Here is the code.

import multiprocessing
import random

import numpy as np


def print_map(_):
    # Draw one value from NumPy's global random state.
    print(np.random.rand(1))
    # print(random.uniform(0, 1))  # standard-library alternative


if __name__ == "__main__":
    num_data = 8
    cores = 8
    pool = multiprocessing.Pool(processes=cores)
    print_random = pool.map(print_map, np.arange(num_data))

The “random” module produces
0.608248187123
0.624808314139
0.578712513812
0.184301478758
0.297702915307
0.886168638762
0.236475271032
0.302152315241

But the “numpy” module produces
[ 0.13633318]
[ 0.13633318]
[ 0.13633318]
[ 0.21356165]
[ 0.13633318]
[ 0.13633318]
[ 0.13633318]
[ 0.13633318]

I would really like to know how this can happen.
Thanks a lot!

Presumably numpy duplicates its random number generator once per process; if so, each process starts from the same random seed. At a guess, the fourth result is different because that process’s setup stage was delayed.
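To illustrate the "same seed, same numbers" idea without involving processes at all, here is a minimal sketch: it saves NumPy's global random state, draws a value, restores the saved state, and draws again. The function name draw_twice_from_same_state is made up for this example. The assumption is that a worker holding a copy of the parent's state behaves like the restored state here.

```python
import numpy as np


def draw_twice_from_same_state():
    # Snapshot the global legacy random state.
    state = np.random.get_state()
    first = np.random.rand()
    # Restore the snapshot, mimicking a second holder of the same state.
    np.random.set_state(state)
    second = np.random.rand()
    return first, second


if __name__ == "__main__":
    a, b = draw_twice_from_same_state()
    print(a, b, a == b)
```

Both draws come out identical, which is the pattern seen in the numpy output above.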

If you really want different random numbers every time, you could call numpy.random.seed(random.randint(0, some_large_integer)) in print_map, where some_large_integer is at most 2**32 - 1 (the largest seed numpy.random.seed accepts).
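A minimal sketch of that suggestion, assuming the standard-library random module gives different values in each worker (as the “random” output above suggests). The function name draw_with_reseed is made up for this example, and it returns the value instead of printing it so the results can be collected by pool.map.

```python
import multiprocessing
import random

import numpy as np


def draw_with_reseed(_):
    # numpy.random.seed accepts integers in [0, 2**32 - 1].
    np.random.seed(random.randint(0, 2**32 - 1))
    return np.random.rand()


if __name__ == "__main__":
    with multiprocessing.Pool(processes=8) as pool:
        print(pool.map(draw_with_reseed, range(8)))
```

Reseeding on every call is wasteful if the function is called many times; seeding once per worker would probably be enough, but this version stays closest to the original print_map.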

Thanks! With your approach the “numpy” module produces different numbers.

But I still couldn’t find where the random number generator was duplicated. Actually the data I produced several days ago using “numpy” seemed to have no such problems.

I don’t know if this is related to the recent update of my Ubuntu. T_T

I can only guess that maybe a recent update of numpy changed the handling of the random state when running multiprocess stuff.

Yeah … Thanks again !

See this comment: Does __getitem__ of dataloader reset random seed?

Yeah! That really helps a lot! Thank you very much!!

If we use random rather than np.random, will everything be OK without setting a seed in every process?

I think so. You can try it like this:
The root cause is that numpy.random.seed() is never called with a seed in the worker processes, so they all start from the same state. Providing a seed for each process is also a solution.
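One possible way to provide distinct seeds is sketched below. It swaps in NumPy's newer Generator API (numpy.random.default_rng, available since NumPy 1.17) instead of numpy.random.seed, and it seeds per task rather than per process. ROOT_SEED and the function name draw are made up for this example.

```python
import multiprocessing

import numpy as np

ROOT_SEED = 12345  # hypothetical value; any fixed integer works


def draw(worker_id):
    # Combining the task index with a shared root seed gives each
    # task its own reproducible random stream.
    rng = np.random.default_rng([worker_id, ROOT_SEED])
    return rng.random()


if __name__ == "__main__":
    with multiprocessing.Pool(processes=8) as pool:
        print(pool.map(draw, range(8)))
```

Because each stream depends only on the task index and the root seed, the results are the same on every run and do not depend on which worker picks up which task.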
