Hi,
I am running the same training code with and without the line:
model = nn.DataParallel(model, device_ids=[0,1])
I am surprised that training with multiple GPUs gets a much better result.
I thought there might be some randomness in the training procedure, so I repeated the single-GPU and multi-GPU training several times. The multi-GPU training always gets a result about 2% better than the single-GPU training, and 2% is not a negligible difference in my task.
I am using ResNet-50 with my own data. I replaced the last fc layer, and dropout is used.
So I want to know: how can this be explained? Are there some operations that work differently in multi-GPU mode?
Thank you