You are so right! Distributed turned off multithreading, and by the time the C++ function asked for the maximum number of threads it was too late: it got 1.
=== begin side question ===
Side question: how can I ask DDP to log to a specific file? DDP logging goes to the terminal, which makes catching warnings difficult: there are a lot of messages, and they get lost. The first time I used DDP it started throwing logs at the terminal, which is impractical for debugging. Is there a way to tell DDP to write its logs to a specific file, or can we manipulate the logger instance here to do that? doc1 and doc2 do not seem to cover this. I didn't investigate further, as other things have higher priority. Thanks! (A rough workaround sketch follows this block.)
=== end side question ===
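In the meantime, here is a minimal sketch of a workaround for my own script-side logs (not an official DDP option; the per-rank file naming and the LOCAL_RANK lookup are my assumptions): attach a file handler per process at the top of the worker script.

    import logging
    import os

    # Route each worker's Python-side logs to its own file.
    # torch.distributed.run exports LOCAL_RANK; torch.distributed.launch
    # passes --local_rank as a CLI argument instead (or sets the env var
    # when run with --use_env), so adjust the lookup to your launcher.
    rank = os.environ.get("LOCAL_RANK", "0")
    logging.basicConfig(
        filename=f"ddp_rank_{rank}.log",  # hypothetical per-rank file name
        filemode="w",
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    logging.info("worker %s started", rank)

Note this only captures messages that go through Python's logging module; anything the C++ side prints straight to stderr would still need shell redirection.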
So now I run export OMP_NUM_THREADS=32 before launching, and the runtime is back to 70 ms with DDP + 2 GPUs! This is cool! Thank you very much, this is a life saver!
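For anyone hitting the same issue, a quick sanity check that the setting actually reaches each worker (a minimal sketch; train.py and the launch line are placeholders for my setup):

    import os
    import torch

    # Run inside each DDP worker, launched e.g. with:
    #   OMP_NUM_THREADS=32 python -m torch.distributed.launch --nproc_per_node=2 train.py
    print("OMP_NUM_THREADS =", os.environ.get("OMP_NUM_THREADS"))
    print("torch intra-op threads =", torch.get_num_threads())

If the launcher's default kicked in, the second line prints 1 instead of 32.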
Also, I was getting this warning, which is easy to miss because it is only printed to the terminal:
*****************************************
Setting OMP_NUM_THREADS environment variable for each process to
be 1 in default, to avoid your system being overloaded, please further tune
the variable for optimal performance in your application as needed.
*****************************************
That explains everything!
Also, I am using distributed.launch, which explains why I am getting this warning:
The module torch.distributed.launch is deprecated and going to be removed
in future.Migrate to torch.distributed.run
In the examples they provided, they use launch. I should probably switch to run; it seems more up to date.
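The main code-level difference I see when switching (a sketch based on the 1.9 behavior; train.py is a placeholder): torch.distributed.run passes the local rank through the LOCAL_RANK environment variable instead of the --local_rank argument that launch appends.

    import argparse
    import os

    # Old style: torch.distributed.launch appends --local_rank to the script:
    #   python -m torch.distributed.launch --nproc_per_node=2 train.py
    parser = argparse.ArgumentParser()
    parser.add_argument("--local_rank", type=int, default=0)
    args = parser.parse_args()

    # New style: torch.distributed.run exports LOCAL_RANK instead:
    #   python -m torch.distributed.run --nproc_per_node=2 train.py
    local_rank = int(os.environ.get("LOCAL_RANK", args.local_rank))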
Also, they used launch in this under Launch utility. The doc probably needs to be updated in the next release; I am using PyTorch 1.9.0.
Again, thank you so much! This was very helpful!