DDP (DistributedDataParallel) is not only used for multi-node training; it also speeds up single-node multi-GPU workloads.
The current proposal is to deprecate DataParallel and, accordingly, to expand the documentation on DDP.