Hi, I'm using `DistributedDataParallel` to run my model across multiple GPUs with sync BN. However, my script uses relative imports and is supposed to be run with the `-m` option. How can I do this when launching it via `torch.distributed.launch`?

Example (does not work, but I'd like to do this):

python -m torch.distributed.launch --nproc_per_node 2 -m detector.train --arg1 --arg2
Take a look at this snippet; it could help.
Thanks for your reply, but I think you misunderstood. The issue is not running `torch.distributed.launch` with the `-m` option; the problem is that my own script uses relative imports and is supposed to be run with the `-m` option. I reckon that when `torch.distributed.launch` spawns the script it uses the more natural approach, `python detector/script.py`, whereas I'd like it to call `python -m detector.script`.
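To make the distinction concrete, here is a small self-contained sketch (the `detector` package layout and file contents are hypothetical, built in a temp directory for illustration) showing that a script with a relative import fails when invoked by path but works when invoked with `-m`:

```python
import pathlib
import subprocess
import sys
import tempfile

# Build a tiny hypothetical package with a relative import.
tmp = pathlib.Path(tempfile.mkdtemp())
pkg = tmp / "detector"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "util.py").write_text("VALUE = 42\n")
(pkg / "script.py").write_text("from .util import VALUE\nprint(VALUE)\n")

# Invoking the file by path: Python runs it with no parent package,
# so the relative import raises an ImportError.
direct = subprocess.run(
    [sys.executable, str(pkg / "script.py")],
    capture_output=True, text=True, cwd=tmp,
)

# Invoking it with -m from the package's parent directory: Python
# imports it as detector.script, so the relative import resolves.
as_module = subprocess.run(
    [sys.executable, "-m", "detector.script"],
    capture_output=True, text=True, cwd=tmp,
)

print(direct.returncode != 0)    # the path invocation fails
print(as_module.stdout.strip())  # the -m invocation prints 42
```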
You can create a copy of this file and customize it the way you want. In it, the module spawns the parallel processes according to their rank. You could arrange things so that the spawned `cmd` looks like:

cmd = python -m detector.script --local_rank --arg1 --arg2 ....
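As a rough sketch of that customization: in a copy of the launcher, the place where the per-process command list is assembled could be changed to insert `-m` and the dotted module name instead of the script path. The helper below is hypothetical (not part of PyTorch); it only illustrates the shape of the resulting command, assuming the launcher passes the rank via a `--local_rank=N` argument as older versions of `torch.distributed.launch` did:

```python
import sys


def build_cmd(module, local_rank, script_args):
    """Hypothetical sketch of the command a customized copy of the
    launcher would spawn: start the child via -m so that relative
    imports inside the module resolve correctly."""
    return [
        sys.executable,                      # same interpreter as the launcher
        "-m", module,                        # run as a module, not a file path
        "--local_rank={}".format(local_rank),
    ] + list(script_args)


print(build_cmd("detector.script", 0, ["--arg1", "--arg2"]))
```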
It is unfortunate that I have to make a copy and alter it, but I guess it works! Thanks a lot =]