Using torch.nn.DataParallel, does a CNN trained on 2 GPUs (batch size = 4) achieve the same performance as the same network trained on 4 GPUs (batch size = 8)? (The network contains batch normalization layers.)
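One way to reason about this: DataParallel splits the input batch along dimension 0 across replicas, and each replica's BatchNorm layers compute statistics only over their own micro-batch (the stats are not synchronized across GPUs unless you use SyncBatchNorm). A minimal sketch of the arithmetic, run on CPU for illustration:

```python
import torch
import torch.nn as nn

# DataParallel scatters the batch across devices, so each replica's
# BatchNorm normalizes over batch_size // n_gpus samples.
for total_batch, n_gpus in [(4, 2), (8, 4)]:
    per_gpu = total_batch // n_gpus
    print(f"{n_gpus} GPUs, batch {total_batch} -> {per_gpu} samples per BN layer")
    # Both setups yield 2-sample micro-batches per BatchNorm.

# Illustrative: BN statistics from a 2-sample micro-batch, as each
# replica would see in either configuration above.
bn = nn.BatchNorm1d(3)
x = torch.randn(2, 3)
_ = bn(x)
print(bn.running_mean)  # running stats updated from only 2 samples
```

So in both configurations each BatchNorm layer normalizes over 2 samples per step, which suggests similar BN behavior; the two runs still differ in effective gradient batch size (4 vs 8), so the trained models are not guaranteed to perform identically.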