DataParallel on single GPU

papyrus · June 15, 2021, 11:36am

Hi everyone,

Is it possible to split a GPU in half and apply DataParallel? It seems like my model doesn’t use the full GPU capacity, and I’ve read that increasing the batch size (which would give more work to the GPU) changes the learning and could make it worse.

Thanks

bsridatta · June 15, 2021, 8:35pm

To my knowledge, the effect would be the same as increasing the batch size. DataParallel makes a copy of the model on each GPU, splits the batch between them, and merges the results. The gradients from the backward pass are also collected across all the GPUs and applied in one collective update, so all the copies update at once. So would there be a difference? There shouldn't be.
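
To illustrate why splitting the batch should not change the learning, here is a minimal sketch in plain Python. It is not DataParallel itself, only the arithmetic behind it, and it assumes a mean-reduced loss and two equal-sized halves. The toy model (`w * x`), the data, and the helper `mse_grad` are made up for the example.

```python
import math

# Toy model (hypothetical): prediction = w * x, loss = mean squared error.
def mse_grad(w, xs, ys):
    """Gradient of mean((w*x - y)^2) with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

w = 0.5
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

# Gradient computed on the whole batch at once.
full_batch_grad = mse_grad(w, xs, ys)

# Same batch split in two equal halves, as two model copies would see it.
half_a = mse_grad(w, xs[:2], ys[:2])
half_b = mse_grad(w, xs[2:], ys[2:])

# Merging the per-half gradients by averaging them.
merged_grad = (half_a + half_b) / 2

print(full_batch_grad, merged_grad)
print(math.isclose(full_batch_grad, merged_grad))
```

Under these assumptions the merged gradient matches the full-batch one, so the update is the one a single pass over the combined batch would produce.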

So how do you use your GPU more efficiently? Run multiple experiments at once!
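
One possible way to do that, as a rough sketch: start each experiment as its own process and pin them all to the same GPU through the `CUDA_VISIBLE_DEVICES` environment variable. The helper `launch_experiments` and the stand-in commands are assumptions for illustration; in practice each command would be your own training script with different settings, and the runs need to fit in GPU memory together.

```python
import os
import subprocess
import sys

def launch_experiments(commands, gpu_index=0):
    """Start every command as its own process, all seeing the same GPU.

    Returns the exit code of each process, in the order given.
    """
    # Copy the current environment and restrict visible GPUs for the children.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_index))
    procs = [subprocess.Popen(cmd, env=env) for cmd in commands]
    # Wait for all runs to finish.
    return [p.wait() for p in procs]

# Stand-in commands so the sketch runs anywhere; swap in your real
# training invocations here.
commands = [
    [sys.executable, "-c",
     f"import os; print('run {i} on GPU', os.environ['CUDA_VISIBLE_DEVICES'])"]
    for i in range(2)
]

exit_codes = launch_experiments(commands)
print(exit_codes)
```

Whether this actually improves utilization depends on the model and on how much memory each run needs, so it is worth checking with `nvidia-smi` while the runs are active.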

papyrus · June 16, 2021, 10:48am

Ok, thanks for the answer!