I have a training script that works well for single-node, multi-GPU training. What changes should I make to the script to convert it to multi-node training? (I am already aware of the changes needed in the launch command.)
I use the local rank in my training script. Should this be changed for multi-node training?
model = DDP(model, device_ids=[local_rank], output_device=local_rank)
Is this correct for multi-GPU training, and what should be changed for multi-node training?
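
To make the question concrete, here is a minimal sketch of the kind of setup being described. It assumes the script is launched with torchrun, which exports the LOCAL_RANK, RANK, and WORLD_SIZE environment variables to each process. The helper names read_dist_env and wrap_model are hypothetical and not taken from the actual script.

```python
import os


def read_dist_env(env=None):
    """Read the rank variables that torchrun exports for each process."""
    env = os.environ if env is None else env
    return {
        "local_rank": int(env.get("LOCAL_RANK", 0)),  # GPU index on this node
        "rank": int(env.get("RANK", 0)),              # global index across all nodes
        "world_size": int(env.get("WORLD_SIZE", 1)),  # total number of processes
    }


def wrap_model(model):
    """Hypothetical helper: wrap `model` in DDP on this process's local GPU."""
    # Imported lazily so read_dist_env() can be used without PyTorch installed.
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    cfg = read_dist_env()
    local_rank = cfg["local_rank"]

    # Bind this process to its GPU on the current node.
    torch.cuda.set_device(local_rank)

    # Assumption: the default env:// initialization, which reads RANK,
    # WORLD_SIZE, MASTER_ADDR, and MASTER_PORT from the environment.
    if not dist.is_initialized():
        dist.init_process_group(backend="nccl")

    model = model.to(local_rank)
    return DDP(model, device_ids=[local_rank], output_device=local_rank)
```

As an illustration, on the second of two 4-GPU nodes a process could see LOCAL_RANK=1, RANK=5, and WORLD_SIZE=8. In this sketch device_ids uses the local value, since it refers to a GPU on the current machine, while the global rank is only used for process-group bookkeeping.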