Correct GPU memory allocation


I have a server equipped with three Quadro P2000s and one GTX 1050. The Quadros have 5 GB of video memory each and the GTX has 4 GB. I use DataParallel for multi-GPU processing, but I noticed that the batch size is limited by the 4 GB card: when I try to increase the batch size, I get an OOM error on the GTX.
For example, with a batch size of 50 the GTX memory is full, but each Quadro uses only 4 GB of its 5 GB, so about 3 GB of GPU RAM goes unused in total. When I increase the batch size to 56, I get an OOM error because the GTX memory is full.

So the question is: what is the correct way to use all the available GPU memory?

I use this code to parallelize training:

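import torch

net = model()
if torch.cuda.device_count() > 1:
    # Replicate the model on every visible GPU; .to('cuda') is assumed from the garbled tail of this line
    net = torch.nn.DataParallel(net, device_ids=list(range(torch.cuda.device_count()))).to('cuda')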

P.S. The GTX is at index 0.

Thank you for your support !!!

I believe DataParallel assumes identical or very similar GPUs, given the _check_balance warning in its source.
However, you may want to check whether using the 1050 is beneficial from a load-balancing perspective to begin with, as it may be bottlenecking training compared to the P2000s even after accounting for the difference in available memory.
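
A rough way to run that check before building device_ids is sketched below. It is only an illustration: pick_balanced_devices and query_total_memory are hypothetical helper names, and the default 0.75 cutoff is an assumption chosen to mirror the 75% figure in DataParallel's imbalance warning. The helper compares memory only, not core count or speed.

```python
def pick_balanced_devices(total_memory, min_ratio=0.75):
    """Return ids of GPUs whose memory is at least min_ratio of the largest card.

    total_memory maps device index -> total memory (bytes, GB, any consistent unit).
    """
    if not total_memory:
        return []
    largest = max(total_memory.values())
    return sorted(i for i, mem in total_memory.items() if mem / largest >= min_ratio)


def query_total_memory():
    """Read total memory per visible GPU; needs PyTorch with CUDA available."""
    import torch  # imported lazily so the helper above also works without PyTorch

    return {
        i: torch.cuda.get_device_properties(i).total_memory
        for i in range(torch.cuda.device_count())
    }


# Memory layout from the question, in GB: GTX 1050 at index 0, three Quadro P2000s.
example = {0: 4, 1: 5, 2: 5, 3: 5}
print(pick_balanced_devices(example))                 # 4/5 = 0.8 passes the 0.75 cutoff
print(pick_balanced_devices(example, min_ratio=0.9))  # a stricter cutoff drops index 0
```

The returned list could be passed as device_ids to torch.nn.DataParallel. If index 0 is dropped, the model has to be moved to the first remaining device before wrapping, since DataParallel expects the parameters on device_ids[0].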

Seems like DistributedDataParallel is the solution. Thanks a lot! I'll try this.
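
With DistributedDataParallel each process builds its own DataLoader, so the per-GPU batch size does not have to be the same on every card. The sketch below shows one possible way to pick those sizes; split_batch_by_memory is a hypothetical helper, and splitting in proportion to total memory is only a heuristic, since weights and optimizer state take a fixed share of each card regardless of batch size.

```python
def split_batch_by_memory(global_batch, memory_per_gpu):
    """Split global_batch into per-GPU sizes proportional to memory_per_gpu.

    Uses largest-remainder rounding so the sizes always sum to global_batch.
    """
    total = sum(memory_per_gpu)
    exact = [global_batch * m / total for m in memory_per_gpu]
    sizes = [int(x) for x in exact]  # floor of each proportional share
    leftover = global_batch - sum(sizes)
    # Hand the remaining samples to the GPUs with the largest fractional parts.
    by_fraction = sorted(
        range(len(exact)), key=lambda i: exact[i] - sizes[i], reverse=True
    )
    for i in by_fraction[:leftover]:
        sizes[i] += 1
    return sizes


# Rank 0 is the 4 GB GTX 1050, ranks 1-3 are the 5 GB Quadro P2000s.
print(split_batch_by_memory(56, [4, 5, 5, 5]))
```

Each rank would then use its own entry as batch_size. Two caveats apply: DistributedDataParallel averages gradients across ranks with equal weight, so samples on the smaller batch count slightly more, and every rank still has to run the same number of steps per epoch.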

Thank you for your reply! It looks like regular DataParallel is not the best way to do multi-GPU training. Thank you!