How to set OMP_NUM_THREADS for distributed training?

I got a warning, but it came with no link or suggestion for how to tune this number (or what it means). How do I choose this value?

Warning:

(meta_learning_a100) [miranda9@hal-dgx diversity-for-predictive-success-of-meta-learning]$ python -m torch.distributed.launch --nproc_per_node=2 ~/ultimate-utils/tutorials_for_myself/my_l2l/dist_maml_l2l_from_seba.py

/home/miranda9/miniconda3/envs/meta_learning_a100/lib/python3.9/site-packages/torch/distributed/launch.py:178: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torch.distributed.run.
Note that --use_env is set by default in torch.distributed.run.
If your script expects `--local_rank` argument to be set, please
change it to read from `os.environ['LOCAL_RANK']` instead. See 
https://pytorch.org/docs/stable/distributed.html#launch-utility for 
further instructions

  warnings.warn(
WARNING:torch.distributed.run:*****************************************
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
*****************************************
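As an aside, the deprecation notice says scripts should read the local rank from the environment instead of a --local_rank argument. A minimal sketch of that change (LOCAL_RANK is the variable named in the warning; the rest is just illustrative):

import os

# torch.distributed.run sets LOCAL_RANK for each worker process it spawns
local_rank = int(os.environ['LOCAL_RANK'])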

Since it’s an environment variable, you can simply set its value on the command line:
OMP_NUM_THREADS=$VALUE python -m torch.distributed.launch --nproc_per_node=2 xxxxx
This works like other environment variables, e.g. CUDA_VISIBLE_DEVICES.
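And since the warning says torch.distributed.launch is deprecated in favor of torch.distributed.run, the equivalent invocation with the newer launcher should look like this (same placeholder for the script):

OMP_NUM_THREADS=$VALUE python -m torch.distributed.run --nproc_per_node=2 xxxxx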

Oh cool. But I was curious: how does one choose the value of OMP_NUM_THREADS=$VALUE?


Did you manage to solve this problem?

It seems that a reasonable value is OMP_NUM_THREADS = nb_cpu_threads / nproc_per_node, so that the machine's CPU threads are split evenly across the worker processes. Use htop to find the number of CPU threads on your machine.
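A minimal Python sketch of that heuristic (assuming os.cpu_count() reports the number of logical CPU threads, and that nproc_per_node matches the value passed to the launcher):

import os

nproc_per_node = 2  # must match --nproc_per_node given to the launcher
# divide the machine's logical CPU threads evenly among the worker processes
omp_num_threads = max(1, os.cpu_count() // nproc_per_node)
print(omp_num_threads)  # set OMP_NUM_THREADS to this value before launching

So on a machine with 32 CPU threads and 2 processes per node, you would launch with OMP_NUM_THREADS=16.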