Code changes affect already-running applications when training with PyTorch Distributed

ckx506772099 · January 19, 2022, 7:53am

I noticed two strange things:
First, I modified the code while a Distributed training job was running. After a while, the already-running application reported an error caused by my modification.
Second, I added a print statement to the code, and after a while the program that was already running started printing accordingly.

wanchaol · February 7, 2023, 4:17am

@ckx506772099 Sorry for the late reply! Please see "DDP is affected by code modification": this is behavior of the Python multiprocessing module rather than of DDP itself.
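
To illustrate the multiprocessing behavior mentioned above, here is a minimal sketch that uses only the standard library, not PyTorch. With the `spawn` start method, a child process re-imports the main script from disk when it starts, so any process spawned after the file was edited runs the edited code. The script name `train_demo.py`, the `MESSAGE` variable and the `worker` function are all hypothetical and exist only for this demo. Whether your job spawns processes late (for example, data-loading workers started at each epoch) is an assumption about your setup.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical stand-in for a training script; not taken from the original post.
TRAIN_SCRIPT = textwrap.dedent(
    """
    import multiprocessing as mp
    from pathlib import Path

    MESSAGE = 'old code'

    def worker(queue):
        # Runs in the child, which re-imports this file from disk under 'spawn'.
        queue.put(MESSAGE)

    if __name__ == '__main__':
        # Simulate editing the source file while the parent is already running.
        path = Path(__file__)
        path.write_text(path.read_text().replace("'old code'", "'new code'", 1))

        ctx = mp.get_context('spawn')
        queue = ctx.Queue()
        proc = ctx.Process(target=worker, args=(queue,))
        proc.start()
        print('parent sees:', MESSAGE)
        print('child sees:', queue.get())
        proc.join()
    """
)


def run_demo():
    """Write the demo script to a temporary file, run it, and return its stdout."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "train_demo.py"
        script.write_text(TRAIN_SCRIPT)
        result = subprocess.run(
            [sys.executable, str(script)],
            capture_output=True,
            text=True,
            check=True,
            timeout=60,
        )
        return result.stdout


if __name__ == "__main__":
    print(run_demo())
```

The parent keeps the value it loaded at startup, while the spawned child picks up the edited file. If you need to edit code during a long run, one option is to launch the job from a copy or snapshot of the source tree so that later edits cannot reach it.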