Can memory be distributed among GPUs?

Consider the memory usage of 4 GPUs while training my models using nn.DataParallel.
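For reference, the model is wrapped roughly like this (a simplified sketch; `MyModel` and `batch` are placeholders for my actual model and input):

```python
import torch
import torch.nn as nn

model = MyModel()
# Replicate the model across 4 GPUs; cuda:0 gathers the outputs
model = nn.DataParallel(model, device_ids=[0, 1, 2, 3]).to("cuda:0")

output = model(batch.to("cuda:0"))  # loss is then computed on cuda:0
```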

We can see that cuda:0 acts as the master device and uses more memory than the others. Is there any way to distribute the memory uniformly across all GPUs?

That’s a known limitation of nn.DataParallel, which is one reason we recommend using DistributedDataParallel, besides the better performance of the latter approach.
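With DistributedDataParallel each GPU runs its own process and holds its own replica, so no single device has to gather outputs or compute the loss for all replicas. A minimal sketch (assuming a placeholder model `MyModel`, dataset `my_dataset`, and `num_epochs`; your training loop will differ):

```python
# Launch with e.g.: torchrun --nproc_per_node=4 train.py
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader
from torch.utils.data.distributed import DistributedSampler


def main():
    # torchrun sets LOCAL_RANK, RANK, and WORLD_SIZE for each process
    local_rank = int(os.environ["LOCAL_RANK"])
    dist.init_process_group(backend="nccl")
    torch.cuda.set_device(local_rank)

    # Each process builds the model on its own GPU
    model = MyModel().to(local_rank)
    model = DDP(model, device_ids=[local_rank])

    sampler = DistributedSampler(my_dataset)
    loader = DataLoader(my_dataset, batch_size=32, sampler=sampler)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    criterion = torch.nn.CrossEntropyLoss()

    for epoch in range(num_epochs):
        sampler.set_epoch(epoch)  # reshuffle shards each epoch
        for data, target in loader:
            data, target = data.to(local_rank), target.to(local_rank)
            optimizer.zero_grad()
            loss = criterion(model(data), target)
            loss.backward()  # gradients are all-reduced across processes
            optimizer.step()

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

Since the loss and gradients live on each rank's own GPU, the memory usage should be roughly balanced across the devices.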
