Run `torch.distributed.launch` with `-m` option for the script

Hi, I’m using DistributedDataParallel to run my model across multiple GPUs with sync BN. However, my script uses relative imports and is supposed to be run with the `-m` option. How can I do this when launching it via `torch.distributed.launch`?

Example (this does not work, but it’s what I’d like to do):

```
python -m torch.distributed.launch --nproc_per_node 2 -m detector.train --arg1 --arg2
```

Thanks

Take a look at this snippet; it could help.

Thanks for your reply, but I think you misunderstood. The issue is not running `torch.distributed.launch` itself with the `-m` option. The problem is that my script uses relative imports and is supposed to be run with `-m`. I reckon that when `torch.distributed.launch` spawns the script, it uses the more natural form `python detector/script.py`, whereas I’d like it to be invoked as `python -m detector.script`.

You can create a copy of that file (`torch/distributed/launch.py`) and customize it the way you want. That module spawns one worker process per rank; you could adjust the part that builds the command so that it ends up as `cmd = python -m detector.script --local_rank --arg1 --arg2 ...`.
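For illustration, here is a minimal sketch of what such a customized launcher could look like. It is not the stock launcher: the file name `my_launch.py`, the module path `detector.train`, and the flag names are placeholders, and it assumes your training module reads the usual `env://` variables (`MASTER_ADDR`, `MASTER_PORT`, `RANK`, `WORLD_SIZE`) plus a `--local_rank` argument, the way scripts written for `torch.distributed.launch` typically do:

```python
# my_launch.py -- a minimal sketch of a customized launcher, modeled on
# torch/distributed/launch.py. "detector.train" below is just a placeholder.
import os
import subprocess
import sys
from argparse import REMAINDER, ArgumentParser


def main():
    parser = ArgumentParser(
        description="Run a module (via python -m) once per local GPU"
    )
    parser.add_argument("--nproc_per_node", type=int, default=1)
    parser.add_argument("--master_addr", default="127.0.0.1")
    parser.add_argument("--master_port", type=int, default=29500)
    parser.add_argument(
        "training_module", help="dotted module path, e.g. detector.train"
    )
    parser.add_argument(
        "module_args", nargs=REMAINDER, help="arguments forwarded to the module"
    )
    args = parser.parse_args()

    processes = []
    for local_rank in range(args.nproc_per_node):
        # Environment variables read by init_process_group(init_method="env://").
        env = os.environ.copy()
        env["MASTER_ADDR"] = args.master_addr
        env["MASTER_PORT"] = str(args.master_port)
        env["WORLD_SIZE"] = str(args.nproc_per_node)
        env["RANK"] = str(local_rank)
        env["LOCAL_RANK"] = str(local_rank)

        # The one real change vs. the stock launcher: run the target as a
        # module with -m (instead of "python detector/train.py") so that
        # relative imports inside the package keep working.
        cmd = [
            sys.executable,
            "-m",
            args.training_module,
            f"--local_rank={local_rank}",
        ] + args.module_args
        processes.append(subprocess.Popen(cmd, env=env))

    for process in processes:
        ret = process.wait()
        if ret != 0:
            raise subprocess.CalledProcessError(returncode=ret, cmd=process.args)


if __name__ == "__main__":
    main()
```

You would then launch it as, e.g., `python my_launch.py --nproc_per_node 2 detector.train --arg1 --arg2`.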


It is unfortunate that I have to make a copy and alter it, but I guess it works! Thanks a lot =]
