I have a server equipped with three Quadro P2000s and one GTX 1050. The Quadros have 5 GB of video memory each and the GTX has 4 GB. I use DataParallel for multi-GPU training, but I noticed that the usable batch size is effectively limited by the 4 GB card: when I try to increase it, I get an OOM on the GTX.
For example, with a batch size of 50 the GTX memory is full, but each Quadro uses only about 4 GB of its 5 GB, so roughly 3 GB of GPU RAM stays unused in total. When I increase the batch size to 56 I get an OOM because the GTX runs out of memory.
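For reference, per-GPU capacity and current usage can be checked with something like the following (a rough diagnostic sketch using standard torch.cuda calls, run from the training process):

import torch

# list total capacity and currently allocated memory for every visible GPU
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    total_gib = props.total_memory / 1024**3
    alloc_gib = torch.cuda.memory_allocated(i) / 1024**3
    print(f"cuda:{i} {props.name}: {alloc_gib:.2f} / {total_gib:.2f} GiB")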
So the question is: what is the correct way to use all of the available GPU memory?
I use this code to parallelize training:
import torch

net = model()  # model() builds the network
if torch.cuda.device_count() > 1:
    # replicate the model and split each input batch across all visible GPUs
    net = torch.nn.DataParallel(net, device_ids=list(range(torch.cuda.device_count())))
net.to('cuda')
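As far as I know, DataParallel always splits each input batch into (roughly) equal chunks across device_ids, so the 4 GB card caps every GPU's share. Uneven splits are possible at the tensor level through torch.cuda.comm.scatter and its chunk_sizes argument, but DataParallel does not expose that option, so using it would mean writing a custom wrapper around the model. A minimal sketch of the uneven split itself, where the device order and the 15/15/15/11 split are assumptions for illustration:

import torch

# split a batch of 56 samples unevenly across four GPUs,
# giving the 4 GB card (assumed to be device 3 here) the smallest chunk
batch = torch.randn(56, 3, 224, 224)
chunks = torch.cuda.comm.scatter(batch, devices=[0, 1, 2, 3], chunk_sizes=[15, 15, 15, 11])
for chunk in chunks:
    print(chunk.device, chunk.shape[0])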
However, you may want to check whether using the 1050 is beneficial from a load-balancing perspective in the first place, since it may bottleneck training compared to the P2000s even after accounting for the difference in available memory.
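If the 1050 turns out to hurt more than it helps, restricting DataParallel to the P2000s is straightforward. The sketch below assumes the P2000s are devices 0-2, which should be verified first (for example with the memory listing above or nvidia-smi):

import torch

p2000_ids = [0, 1, 2]  # assumed indices of the three P2000s; adjust to the actual ordering
net = torch.nn.DataParallel(model(), device_ids=p2000_ids)
# DataParallel expects the module's parameters on the first device in device_ids
net.to(f'cuda:{p2000_ids[0]}')

Setting CUDA_VISIBLE_DEVICES=0,1,2 before launching the script achieves the same thing without code changes.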