Do we need to call `cuda()` on the model and data if we use `DataParallel`?
Say we have four GPUs; specifically, there are three questions:
a. If we do not call `cuda()`, the model and data stay on the CPU. Will there be any time inefficiency when they are replicated to the 4 GPUs?
b. If we call `cuda()`, the model and data are on GPU #1. Will there be any space inefficiency from replicating them again on GPU #1, or will they not be replicated again if they are already there?
c. Overall, for time/space efficiency, should we call `cuda()` if we use `DataParallel`?
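For context, here is a minimal sketch of the pattern I am asking about, using a placeholder `nn.Linear` model (the `is_available()` guards are just so the snippet also runs on a CPU-only machine; with no GPUs, `DataParallel` simply calls the wrapped module directly):

```python
import torch
import torch.nn as nn

# Placeholder model; the canonical pattern is to move the model to the
# default device *before* wrapping it in DataParallel, since DataParallel
# expects the parameters to live on device_ids[0] and replicates them to
# the other GPUs on each forward pass.
model = nn.Linear(10, 2)
if torch.cuda.is_available():
    model = model.cuda()          # parameters on GPU 0 (device_ids[0])
model = nn.DataParallel(model)    # replicas are created per forward call

# The input batch only needs to be on the same device as device_ids[0];
# DataParallel scatters it across the GPUs itself.
x = torch.randn(8, 10)
if torch.cuda.is_available():
    x = x.cuda()
y = model(x)
print(y.shape)  # torch.Size([8, 2])
```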