How do I get the --nproc_per_node=32 value from within my Python script?

How do I read, from within my Python script, a flag that I am passing to torchrun? I pass --nproc_per_node=32 there and want the script to pick that number up automatically, rather than having to make sure the two scripts match by hand. (Note that I want to set the world size myself: I am running CPU-parallel jobs and want to choose that value.)
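For context, here is a minimal sketch of what I am hoping for. As far as I understand, torchrun exports a LOCAL_WORLD_SIZE environment variable to each worker that matches --nproc_per_node, along with WORLD_SIZE for the total across nodes. The helper names and the fallback default of 1 below are my own, hypothetical choices and not part of any PyTorch API.

```python
import os


def get_nproc_per_node(default: int = 1) -> int:
    """Return the per-node process count exported by torchrun.

    Hypothetical helper: reads LOCAL_WORLD_SIZE, which torchrun sets for
    each worker; falls back to `default` when the script is run directly.
    """
    return int(os.environ.get("LOCAL_WORLD_SIZE", default))


def get_world_size(default: int = 1) -> int:
    """Return the total number of workers across all nodes (WORLD_SIZE)."""
    return int(os.environ.get("WORLD_SIZE", default))


if __name__ == "__main__":
    # Under `torchrun --nproc_per_node=32 script.py` on a single node,
    # both values should match the flag; run directly, both fall back to 1.
    print(f"nproc_per_node={get_nproc_per_node()} world_size={get_world_size()}")
```

If this is right, the script would no longer need its own hard-coded copy of the number, and I could still override the world size explicitly when I want a different value.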

related: Multiprocessing failed with Torch.distributed.launch module - #28 by Brando_Miranda
