Out of memory while using nn.DataParallel() but fits with one GPU

I have been implementing deep autoencoders for recommender systems.
My data consists of 535,000 users and 1,551,731 items.

The model's architecture is:
############# AutoEncoder Model: #####################
DeepEncoder(
(en1): Linear(in_features=1551731, out_features=128, bias=True)
(en2): Linear(in_features=128, out_features=192, bias=True)
(en3): Linear(in_features=192, out_features=256, bias=True)
(en4): Linear(in_features=256, out_features=320, bias=True)
(dp1): Dropout(p=0.8)
(su): SELU()
(de1): Linear(in_features=320, out_features=256, bias=True)
(de2): Linear(in_features=256, out_features=192, bias=True)
(de3): Linear(in_features=192, out_features=128, bias=True)
(de4): Linear(in_features=128, out_features=1551731, bias=True)
)
######################################################
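
For reference, here is a minimal module definition consistent with the printout above; the exact placement of dropout and SELU inside forward is illustrative, not necessarily the original ordering:

import torch
import torch.nn as nn

class DeepEncoder(nn.Module):
    def __init__(self, n_items):
        super().__init__()
        # Encoder: n_items -> 128 -> 192 -> 256 -> 320
        self.en1 = nn.Linear(n_items, 128)
        self.en2 = nn.Linear(128, 192)
        self.en3 = nn.Linear(192, 256)
        self.en4 = nn.Linear(256, 320)
        self.dp1 = nn.Dropout(p=0.8)
        self.su = nn.SELU()
        # Decoder: 320 -> 256 -> 192 -> 128 -> n_items
        self.de1 = nn.Linear(320, 256)
        self.de2 = nn.Linear(256, 192)
        self.de3 = nn.Linear(192, 128)
        self.de4 = nn.Linear(128, n_items)

    def forward(self, x):
        # Encode
        x = self.su(self.en1(x))
        x = self.su(self.en2(x))
        x = self.su(self.en3(x))
        x = self.dp1(self.su(self.en4(x)))
        # Decode
        x = self.su(self.de1(x))
        x = self.su(self.de2(x))
        x = self.su(self.de3(x))
        return self.de4(x)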

If I use:
rencoder = DeepEncoder(1551731)
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
rencoder = nn.DataParallel(rencoder)
rencoder.to(device)
I run out of CUDA memory.

But the same model works fine when everything is kept on one GPU.
Please help. I want to use both GPUs so that I can enhance my model architecture.
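
For completeness, here is a minimal sketch of the two-GPU setup I am attempting; the batch size and random input are placeholders. With nn.DataParallel the input batch is split along dimension 0 across the visible GPUs, and the per-GPU outputs are gathered back onto cuda:0.

import torch
import torch.nn as nn

# DeepEncoder as defined above
n_items = 1551731
rencoder = DeepEncoder(n_items)

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
# Replicate the model on GPUs 0 and 1; inputs are scattered along the
# batch dimension and outputs are gathered back on device_ids[0].
rencoder = nn.DataParallel(rencoder, device_ids=[0, 1])
rencoder.to(device)

# Placeholder batch: 32 users over the full item vector.
batch = torch.rand(32, n_items, device=device)
output = rencoder(batch)  # shape: (32, n_items), gathered on cuda:0
loss = nn.functional.mse_loss(output, batch)
loss.backward()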

I have the same issue. Did you find a fix?