Out of memory when calling model.cuda()

I have a model with about 41 million parameters.

When I try to call model.cuda(), it fails with cuda runtime error (2): out of memory.

  File "/home/anaconda3/lib/python3.6/site-packages/torch/_utils.py", line 69, in _cuda
    return new_type(self.size()).copy_(self, async)
RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1512386481460/work/torch/lib/THC/generic/THCStorage.cu:58

The GPU has 12 GB of memory and nothing else is using it. If I move another model (even a bigger one) to the GPU, it works fine.

>>> nn.Linear(100, 400).cuda()
Linear(in_features=100, out_features=400)
>>> nn.Linear(400, 400).cuda()
Linear(in_features=400, out_features=400)
>>> a=torchvision.models.resnet50().cuda()
(long output...)
>>> mymodel.cuda() #smaller than resnet50
--- errors ---

What could be the problem? How can a model make the .cuda() call fail with an out-of-memory error?

In the session above, you are trying to store both the resnet and your custom model on the GPU at the same time, because the CUDA resnet is still referenced by the variable a.

Sure, but that was just an example. The error is raised even if my model is the only one I load.

Then I guess your model is bigger than you think?
Could you show us the model you’re using (at least the init function)?
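
If it helps, here is a rough sketch of how one might check the real size before calling .cuda(). It assumes a recent PyTorch version where parameters expose numel() and element_size(); parameter_report is a made-up helper name, not part of PyTorch. It counts only the parameters themselves, not buffers, gradients, or activations.

```python
def parameter_report(model, top=5):
    """Summarize parameter count and memory for a torch.nn.Module-like object.

    Only relies on named_parameters(), numel() and element_size(),
    so the helper itself does not need to import torch.
    """
    # One row per parameter tensor: (name, element count, size in bytes).
    rows = [(name, p.numel(), p.numel() * p.element_size())
            for name, p in model.named_parameters()]
    total_params = sum(count for _, count, _ in rows)
    total_bytes = sum(nbytes for _, _, nbytes in rows)
    # The largest tensors first, to spot a layer that is far too big.
    largest = sorted(rows, key=lambda row: row[1], reverse=True)[:top]
    return total_params, total_bytes, largest


# Usage (assuming `mymodel` is your torch.nn.Module, still on the CPU):
#   total_params, total_bytes, largest = parameter_report(mymodel)
#   print(f"{total_params:,} parameters, {total_bytes / 1024**2:.1f} MiB")
#   for name, count, nbytes in largest:
#       print(f"{name:40s} {count:>14,} {nbytes / 1024**2:10.1f} MiB")
```

Sorting by size usually makes an oversized layer obvious at a glance. For reference, 41 million float32 parameters come to roughly 160 MB, so a total in the gigabytes points to a misconfigured layer.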

Ah, you're right: I had one parameter set wrongly, which made the whole model blow up in size. Thank you, and sorry for wasting your time! :)
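
For anyone landing here with the same error, here is a small sketch of how a single wrong size can do this. The numbers are hypothetical (the actual parameter from this thread was not posted), and linear_params is an illustrative helper, not a PyTorch function. It uses the standard count for a fully connected layer: in_features * out_features weights plus out_features biases.

```python
def linear_params(in_features, out_features, bias=True):
    """Number of parameters in a fully connected layer."""
    return in_features * out_features + (out_features if bias else 0)


def float32_gib(num_params):
    """Memory for the parameters alone, in GiB, at 4 bytes each."""
    return num_params * 4 / 1024**3


# Hypothetical example: flattening a 512-channel feature map into a
# 4096-unit layer. The intended spatial size is 7x7; the mistake uses
# the 224x224 input resolution instead.
intended = linear_params(512 * 7 * 7, 4096)
mistaken = linear_params(512 * 224 * 224, 4096)

print(f"intended: {intended:,} params, {float32_gib(intended):.2f} GiB")
print(f"mistaken: {mistaken:,} params, {float32_gib(mistaken):.2f} GiB")
```

With these made-up sizes the mistaken layer alone needs far more than 12 GB, so .cuda() would fail even on an otherwise empty GPU.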