All the GPUs are working, but one fills up. What am I doing wrong?

I’m trying to build a segmentation model, and I keep getting “CUDA error: out of memory”. After investigating, I realized that all 4 GPUs are working, but one of them keeps filling up.

Some technical details:

  • My Model: the model is written in PyTorch and has 3.8M parameters.
  • My Hardware: I have 4 GPUs (Titan V) with 12 GB of VRAM each.

I’m trying to understand why one of my GPUs is filling up, and what I am doing wrong.

  • Evidence: as can be seen in the screenshot below, all the GPUs are working, but one of them keeps filling up until it reaches its limit.

[screenshot of GPU memory usage]

  • Code: I’ll try to explain what I did in the code. First, my model:
model = model.cuda()
model = nn.DataParallel(model, device_ids=None)
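For context on why this pattern concentrates memory on one device: nn.DataParallel replicates the module on every visible GPU, scatters each batch along dimension 0, runs the replicas in parallel, and then gathers all outputs back onto device_ids[0] (cuda:0 by default). The gathered outputs, and later the loss with its autograd graph, therefore live on GPU 0 alone. A minimal sketch of that round trip (the toy SmallNet below is made up for illustration; on a machine without GPUs, DataParallel simply runs the module directly):

```python
import torch
import torch.nn as nn

class SmallNet(nn.Module):
    # Toy stand-in for the segmentation model (not the original 3.8M-param net)
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 1, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(x)

model = SmallNet()
if torch.cuda.is_available():
    model = model.cuda()

# device_ids=None means "use all visible GPUs".
# Outputs are gathered on device_ids[0], i.e. cuda:0.
model = nn.DataParallel(model, device_ids=None)

inputs = torch.randn(8, 3, 64, 64)
if torch.cuda.is_available():
    inputs = inputs.to('cuda')

# The batch is split along dim 0 across replicas for the forward pass,
# but the gathered output tensor lives entirely on GPU 0.
outputs = model(inputs)
print(outputs.shape)
```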

Second, Inputs and targets:

inputs = inputs.to('cuda')
masks = masks.to('cuda')

Those are the lines that work with the GPUs. If I missed something and you need anything else, please let me know.
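Assuming the training loop follows the usual pattern (the criterion and optimizer below are placeholders, not taken from the original post), the step where memory piles up on one device is the loss computation: the full-resolution outputs are gathered onto GPU 0 before the loss is evaluated there:

```python
import torch
import torch.nn as nn

# Placeholder model and data; the real segmentation model is not shown here.
model = nn.DataParallel(nn.Conv2d(3, 1, kernel_size=3, padding=1))
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

inputs = torch.randn(4, 3, 32, 32)
masks = torch.rand(4, 1, 32, 32)

optimizer.zero_grad()
outputs = model(inputs)           # forward runs on all GPUs, gather on GPU 0
loss = criterion(outputs, masks)  # loss tensor and its graph live on GPU 0
loss.backward()                   # gradients flow back to each replica
optimizer.step()
print(float(loss))
```

Every full-size `outputs` tensor plus the loss graph sits on the first device on top of its share of the forward pass, which matches the observed pattern of one GPU filling up while the others stay balanced.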

I feel like I’m missing something so basic that it will affect not only this model but future models as well. I’d be more than happy for some help.

Thanks a lot!

Please someone, pretty desperate over here!

Have a look at @Thomas_Wolf’s blog post about the behavior of nn.DataParallel. It includes some suggestions for balancing the memory load on a multi-GPU machine.
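One common mitigation along those lines is to compute the loss inside the wrapped module, so each replica reduces its shard of the batch to a scalar and only those small loss values are gathered on GPU 0, instead of full-resolution output maps. A sketch of that idea (the FullModel wrapper name and the placeholder net are illustrative, not from the post):

```python
import torch
import torch.nn as nn

class FullModel(nn.Module):
    """Wraps a model and its criterion so the loss is computed per replica."""
    def __init__(self, model, criterion):
        super().__init__()
        self.model = model
        self.criterion = criterion

    def forward(self, inputs, targets):
        outputs = self.model(inputs)
        # Each replica returns a scalar; with multiple GPUs, DataParallel
        # gathers these scalars into a small vector on device_ids[0].
        return self.criterion(outputs, targets)

net = nn.Conv2d(3, 1, kernel_size=3, padding=1)  # placeholder segmentation net
full = nn.DataParallel(FullModel(net, nn.BCEWithLogitsLoss()))

inputs = torch.randn(4, 3, 32, 32)
masks = torch.rand(4, 1, 32, 32)

losses = full(inputs, masks)  # per-replica loss values
loss = losses.mean()          # average before backward
loss.backward()
```

This keeps the heavy activation tensors on the GPUs that produced them; only scalar losses cross devices, which evens out the memory footprint considerably.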