Hi,
May I ask what happens if the batch size is 1 and DataParallel is used here? Will the data still get split into mini-batches, or will nothing happen?
Best regards
Just tested it: if the batch_size is 1 and DataParallel is used, only 1 GPU is used. If the batch_size is larger than 1, both GPUs are used.
May I ask whether you tested on images or NLP?
Tested on images. I only have 2 GPUs; with more GPUs, the batch size should be at least as large as the number of GPUs so that every GPU receives some data.
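The observed behavior follows from how DataParallel scatters the input: it splits the batch dimension into at most `num_devices` chunks (the same rule as `torch.chunk`), so a batch of 1 yields a single chunk and the remaining GPUs sit idle. A pure-Python sketch of that splitting rule, no GPUs required (`chunk_batch` is a hypothetical helper mimicking the chunking, not a PyTorch API):

```python
def chunk_batch(batch, num_devices):
    """Split a list of samples the way a batch dim is chunked across
    devices: ceil-sized chunks, hence at most min(len(batch), num_devices)
    chunks, so short batches leave some devices with nothing to do."""
    n = len(batch)
    if n == 0:
        return []
    chunk_size = -(-n // num_devices)  # ceiling division
    return [batch[i:i + chunk_size] for i in range(0, n, chunk_size)]

# batch_size = 1 with 2 "GPUs": a single chunk, so only one device is used
print(chunk_batch([0], 2))            # [[0]]
# batch_size = 4 with 2 "GPUs": both devices get 2 samples each
print(chunk_batch([0, 1, 2, 3], 2))   # [[0, 1], [2, 3]]
```

This also explains the earlier observation about more GPUs: with a batch of 3 on 4 devices, only 2 chunks are produced, so 2 devices stay idle.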