Huge gap in training time between macOS and Ubuntu 16.04 LTS with multiprocessing

I’ve implemented the popular reinforcement learning algorithm A3C; here is the source code:

Sorry for the insufficient documentation, but I think it’s enough to reproduce the problem. When I run it on my MacBook Pro 15" (2017), full training takes only 20 seconds. However, when I run the same code, without any modification, on my Ubuntu 16.04 LTS machine (i7-7700K), training takes more than 10 minutes. Even if I switch to a GPU version, it still takes about 1 minute to train.

What causes such a huge gap in training time?

  • Python version: 3.5.2

I used the Unix time command to measure execution time.

  1. Macbook Pro:
    real 0m19.165s
    user 2m14.593s
    sys 0m1.469s

  2. Ubuntu:
    real 2m29.506s
    user 7m28.857s
    sys 11m54.686s

It seems that on Ubuntu most of the time is spent in sys. I finally solved the issue by adding os.environ["OMP_NUM_THREADS"] = "1", which I found in this thread: What is the purpose of `os.environ[‘OMP_NUM_THREADS’] = ‘1’
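A minimal sketch of the fix, with a hypothetical worker function standing in for the A3C training loop: the key point is that OMP_NUM_THREADS must be set before importing numerical libraries (numpy, torch, …), since they read it at import time, and each forked worker would otherwise spin up its own full-size OpenMP thread pool and oversubscribe the CPU.

```python
import os

# Limit OpenMP to one thread per process. This must happen BEFORE
# importing numerical libraries, which read the variable at import time.
os.environ["OMP_NUM_THREADS"] = "1"

import multiprocessing as mp

def worker(rank):
    # Hypothetical stand-in for one A3C worker's training loop; the
    # environment variable is inherited by the child process, so its
    # BLAS/OpenMP calls stay single-threaded.
    return os.environ.get("OMP_NUM_THREADS")

if __name__ == "__main__":
    with mp.Pool(processes=4) as pool:
        # Every worker should report "1"
        print(pool.map(worker, range(4)))
```

With many worker processes, giving each one a single OpenMP thread avoids the kernel-level contention that showed up as the huge sys time above.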

After setting OMP_NUM_THREADS to 1, the execution time on Ubuntu 16.04 LTS is:

real 0m11.914s
user 1m21.437s
sys 0m0.654s