How should I use multiple GPUs to train my model with the newest stable version of PyTorch? Is it enough to just cast the model to nn.DataParallel, or do I also have to manually convert BatchNorm layers with nn.SyncBatchNorm?
model = nn.DataParallel(model)
model = nn.SyncBatchNorm.convert_sync_batchnorm(model) (Is this necessary?)
Moreover, do we need to set the batch size to nGPUs times the single-GPU batch size?
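To make the batch-size question concrete, here is a minimal sketch (the toy model and sizes are made up for illustration). nn.DataParallel splits dimension 0 of the input across the visible GPUs, so a "global" batch of nGPUs × per-GPU batch gives each replica one per-GPU batch; with no GPUs it just runs the wrapped module as-is:

```python
import torch
import torch.nn as nn

# Toy model, purely illustrative.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
)

n_gpus = max(torch.cuda.device_count(), 1)
per_gpu_batch = 4  # hypothetical single-GPU batch size

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.DataParallel(model).to(device)

# Feed the global batch; DataParallel scatters it along dim 0,
# runs one replica per GPU, then gathers the outputs back.
x = torch.randn(n_gpus * per_gpu_batch, 3, 16, 16, device=device)
out = model(x)
print(out.shape)
```

So if you want each GPU to see the same per-GPU batch as in your single-GPU runs, you scale the global batch size by the number of GPUs yourself; DataParallel does not do it for you.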
Another weird thing: when I used model = nn.DataParallel(model) without changing the batch size, the work was automatically distributed across 4 GPUs, with total memory in use about 2 times that of the single-GPU case. The training result was also slightly different.
One thing I found is that
model = nn.SyncBatchNorm.convert_sync_batchnorm(model)
is not necessary when I already have
model = nn.DataParallel(model)
at least in PyTorch 1.1.0.
So this question boils down to: how should the batch size be handled with nn.DataParallel, and why is the training result different when using nn.DataParallel?
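For reference, convert_sync_batchnorm is intended for the DistributedDataParallel (one process per GPU) setup rather than DataParallel: it swaps every BatchNorm layer for an nn.SyncBatchNorm that synchronizes statistics across processes during training. A hedged sketch of the conversion, with the usual DDP launch steps shown only as comments (they assume a torchrun-style launcher and are not runnable standalone):

```python
import torch
import torch.nn as nn

# Toy model, purely illustrative.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
)

# Replace every BatchNorm*d module with SyncBatchNorm.
model = nn.SyncBatchNorm.convert_sync_batchnorm(model)

# In a real multi-GPU run you would then do something like:
#   torch.distributed.init_process_group("nccl")
#   local_rank = int(os.environ["LOCAL_RANK"])  # set by the launcher
#   model = model.to(local_rank)
#   model = nn.parallel.DistributedDataParallel(model, device_ids=[local_rank])

print(type(model[1]).__name__)
```

With plain DataParallel the conversion buys you nothing, since all replicas live in one process and SyncBatchNorm's cross-process synchronization never kicks in; each replica still normalizes its own sub-batch.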