Unable to use multiple GPUs with DataParallel

I have wrapped my model with DataParallel to use both of my RTX 2080 Ti's, but I can only see one of them firing up and running out of memory as I start training.

I'm using num_workers=0 (due to an error LMDB gives me when num_workers>0).

Would this be the issue, or is there anything else I'm missing?

import torch
from torch import nn

# assuming the usual device setup (not shown in the original snippet)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')


def train(hierarchy, epochs, lr, batch_size):
    if hierarchy == 'top':
        # PixelSnail is defined elsewhere in the project
        model = PixelSnail(
            [32, 32], 512, 512, 5, 4, 5, 512, dropout=0.1, n_out_res_stack=5,
        )
        if torch.cuda.device_count() > 1:
            print('using:', torch.cuda.device_count(), 'gpus')
            model = nn.DataParallel(model)
            model.to(device)

(The print statement there reports 2 GPUs.)

nn.DataParallel might create imbalanced memory usage, as described here.
This could cause an out-of-memory error on one device, which would stop the script execution.
You could lower the batch size (if it's not already 1 per device) and rerun the script to check the per-device memory usage.
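
If it helps, here is a minimal sketch of how you could compare the memory on each device after a DataParallel forward/backward pass (it uses a placeholder nn.Linear model and a random batch rather than your PixelSnail setup). nn.DataParallel splits the input along the batch dimension, so each GPU only gets work if the per-step batch is large enough to be chunked across devices:

import torch
from torch import nn

# Placeholder model and random batch, just to inspect per-device memory
# after a DataParallel forward/backward pass.
model = nn.DataParallel(nn.Linear(512, 512)).to('cuda')
x = torch.randn(64, 512, device='cuda')  # split across GPUs along dim 0
loss = model(x).sum()
loss.backward()

for i in range(torch.cuda.device_count()):
    mem = torch.cuda.memory_allocated(i) / 1024**2
    print(f'cuda:{i} allocated: {mem:.1f} MiB')

You should see both devices allocate memory, with cuda:0 usually holding a bit more, since the outputs (and the loss computation) are gathered on the first device by default; watching nvidia-smi while the script runs shows the same thing.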