What is the difference between DataParallel and DistributedDataParallel?


(Jalem Raj Rohit) #1

I am going through this imagenet example: https://github.com/pytorch/examples/blob/master/imagenet/main.py

And, in line 88, the module DistributedDataParallel is used. When I searched for it in the docs, I couldn't find anything. Could you point me to the documentation for this module, if any exists?

Otherwise, I would like to know what the difference is between the DataParallel and DistributedDataParallel modules.


(Francisco Massa) #2

DataParallel is for performing training on multiple GPUs on a single machine.
DistributedDataParallel is useful when you want to train across multiple machines.
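
To make the difference concrete, here is a minimal sketch of how each wrapper is applied. The toy model, the `init_method` address, and the `world_size`/`rank` values are placeholders for illustration; in a real multi-process run each process is launched separately and supplies its own rank.

```python
import torch
import torch.nn as nn
import torch.distributed as dist

model = nn.Linear(10, 2).cuda()  # toy model, placed on a GPU

# Single machine, multiple GPUs: wrap the model with DataParallel.
# Each input batch is split across the visible GPUs and the outputs
# are gathered back onto the default device.
dp_model = nn.DataParallel(model)

# Multiple machines (or multiple processes): join a process group
# first, then wrap with DistributedDataParallel. Gradients are
# averaged across processes during the backward pass.
dist.init_process_group(
    backend="gloo",                       # or "nccl" for GPU training
    init_method="tcp://127.0.0.1:23456",  # address of the rank-0 process (placeholder)
    world_size=2,                         # total number of processes (placeholder)
    rank=0,                               # this process's index (placeholder)
)
ddp_model = nn.parallel.DistributedDataParallel(model)
```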