Performance with multi-GPU training

When using torch.nn.DataParallel, will a CNN trained on 2 GPUs (batch size = 4) and the same network trained on 4 GPUs (batch size = 8) reach the same performance? Note that the network contains batch normalization layers.
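For reference, here is a minimal sketch of the setup being asked about; the toy CNN and tensor shapes are placeholders, but the scatter behavior is standard: DataParallel splits the input batch along dimension 0 across the listed devices, so each replica's BatchNorm layers normalize only their local slice.

```python
import torch
import torch.nn as nn

# Toy CNN with a BatchNorm layer (placeholder model, for illustration only)
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(16, 10),
)

# DataParallel scatters each batch across the visible GPUs, so BatchNorm
# statistics are computed per replica over the per-GPU slice:
#   2 GPUs, global batch size 4 -> 2 samples per GPU
#   4 GPUs, global batch size 8 -> 2 samples per GPU
model = nn.DataParallel(model.cuda(), device_ids=[0, 1])  # or [0, 1, 2, 3]

x = torch.randn(4, 3, 32, 32).cuda()  # global batch of 4
out = model(x)
print(out.shape)  # torch.Size([4, 10])
```

In both configurations above, each BatchNorm replica sees 2 samples per step, so the per-GPU statistics have the same effective batch size; whether the runs match overall also depends on the learning rate, data order, and other training details.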