Multiple but different GPUs

Hey guys,

Is it generally possible to use the DataParallel wrapper if you have two different GPUs?

Thanks, Tobi


Yes, that’s possible.
However, you will get a warning if there is an imbalance in GPU memory (one device has less memory than the other).
Also, your performance will be bound by the slowest GPU, so it might not be recommended if you are using GPUs with very different performance profiles.
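For reference, a minimal sketch of that setup, assuming the two GPUs are visible as cuda:0 and cuda:1 and using a toy model just for illustration:

```python
import torch
import torch.nn as nn

# Toy model, stand-in for your actual network.
model = nn.Sequential(
    nn.Linear(1024, 1024),
    nn.ReLU(),
    nn.Linear(1024, 10),
)

# DataParallel splits each batch across the listed device_ids and gathers
# the outputs on the first device in the list.
model = nn.DataParallel(model, device_ids=[0, 1]).to('cuda:0')

x = torch.randn(64, 1024, device='cuda:0')  # batch gets scattered to both GPUs
out = model(x)                               # outputs gathered back on cuda:0
print(out.shape)  # torch.Size([64, 10])
```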


Thanks for the reply,

I have another question: it seems that both GPUs get the same share of the data. If the dataset is too large, I get a CUDA out of memory error, but for a small dataset this is no problem. Do you have any idea how to fix this?

best, Tobi

Have a look at this blog post to see how nn.DataParallel works internally and how to counter some effects of an imbalanced memory usage.

PS: you could also try out nn.DistributedDataParallel, which shouldn’t introduce the imbalance.
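In case it helps, here is a minimal single-machine DistributedDataParallel sketch (one process per GPU, using the nccl backend; the port number is arbitrary). Unlike nn.DataParallel, each process feeds its own slice of the data, so nothing is scattered from or gathered onto a single device:

```python
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

def worker(rank, world_size):
    # One process per GPU; each process holds a full replica of the model.
    os.environ['MASTER_ADDR'] = 'localhost'
    os.environ['MASTER_PORT'] = '29500'
    dist.init_process_group('nccl', rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)

    model = nn.Linear(1024, 10).cuda(rank)
    ddp_model = DDP(model, device_ids=[rank])

    # Each process would load a different shard of the dataset here
    # (e.g. via DistributedSampler); synthetic data for the sketch.
    x = torch.randn(32, 1024, device=rank)
    out = ddp_model(x)
    out.sum().backward()  # gradients are all-reduced across processes

    dist.destroy_process_group()

if __name__ == '__main__':
    world_size = 2  # one process per GPU
    mp.spawn(worker, args=(world_size,), nprocs=world_size)
```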

Apologies for the old thread resurrection here.

If I have a single machine with 2 mismatched GPUs (1 fast with big memory, 1 slow with small memory), DataParallel will only go as fast as the slow GPU with less memory, making it not worth using - as was covered above.

Does DistributedDataParallel on a single machine with 2 mismatched GPUs make sense? Or am I better off simply using the single fast GPU?

I’m trying to understand the usages of DistributedDataParallel (SPSD, SPMD, etc.)

I’ve been trying to read this and getting a bit confused: [POLL][RFC] Can we retire Single-Process Multi-Device Mode from DistributedDataParallel? · Issue #47012 · pytorch/pytorch · GitHub

Both parallel approaches would suffer from the slow GPU, so you would have to check the actual performance using the fast GPU only vs. DistributedDataParallel.
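A rough way to run that comparison is to measure throughput (samples/sec) with the same model and batch size in both setups. A minimal single-GPU baseline might look like this (toy model and synthetic data, just for illustration); you would then compare the number against the combined throughput of a DistributedDataParallel run:

```python
import time
import torch
import torch.nn as nn

def samples_per_second(device='cuda:0', batch_size=64, iters=100):
    # Toy model and synthetic batch, stand-ins for your real workload.
    model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 10)).to(device)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    criterion = nn.CrossEntropyLoss()
    x = torch.randn(batch_size, 1024, device=device)
    y = torch.randint(0, 10, (batch_size,), device=device)

    torch.cuda.synchronize(device)
    start = time.time()
    for _ in range(iters):
        opt.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        opt.step()
    torch.cuda.synchronize(device)
    return iters * batch_size / (time.time() - start)

print(samples_per_second('cuda:0'))  # run on the fast GPU only as the baseline
```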
