Accuracy difference on multi-GPU with nn.DataParallel

@ptrblck,
Thank you for the suggestion, but nn.DataParallel seems to perform considerably worse than its counterpart. Please see the results of the experiments.