CUDA suddenly always runs out of memory even with small models

I’ve been building some models, some larger, some smaller, but all of them could run on my laptop’s GPU (GeForce GTX 1650 Ti). I always ran them with a batch size of 32, but once I tried to increase it to 64, which basically froze my laptop, and I had to force it to shut down. Ever since then, I get a CUDA out of memory error with even the smallest model and any batch size.

I have done a fair amount of googling and searched this forum quite a bit, and I assume some zombie process is occupying a lot of memory. However, I cannot even locate it: nvidia-smi doesn’t show any processes. By the way, I am on Windows, so I cannot use the “killall” or “nvidia-smi --gpu-reset” commands.

Via PowerShell, I have also inspected the active processes with “Get-Process” but couldn’t find anything. That is, when Spyder is closed (which I am using, although running my code from the command prompt leads to similar issues), there aren’t any Python-related processes left, as far as I can tell. torch.cuda.memory_allocated() also indicates that 0 bytes are allocated on startup, before running a model.
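For reference, this is roughly how I checked the allocator’s view of GPU memory before running anything (a minimal sketch, assuming a recent PyTorch build with CUDA support):

```python
import torch

# Assumes CUDA is available and queries the default device.
print(torch.cuda.is_available())       # True on my machine
print(torch.cuda.memory_allocated())   # bytes currently held by tensors -> 0
print(torch.cuda.memory_reserved())    # bytes reserved by the caching allocator -> 0
```

Both counters report 0 right after starting the interpreter, which is what made me suspect something outside my process.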

After a restart, no zombie process from the previous run would still be alive, which is also confirmed by nvidia-smi showing 0MiB of usage.

Are you able to use the GPU at all with any other CUDA application?


Oh boy, I guess I googled a bit too much and convinced myself that it must be a zombie process; all the posts I read about it seemed so similar to what I was experiencing. It turns out I had just created very large latent vectors by accident. Sorry about that.
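For anyone hitting the same symptoms, the mistake looked roughly like this (shapes below are illustrative, not my actual model):

```python
import torch

batch_size, latent_dim = 32, 128
img_size = 256

# Intended: one latent vector per sample -> shape (32, 128), a few KB on the GPU.
z = torch.randn(batch_size, latent_dim, device="cuda")

# What I effectively did: kept a latent vector per *pixel* instead of per sample,
# i.e. shape (32, 256*256, 128) -> roughly 1 GB of float32 before gradients and
# activations, which quickly exceeds the 4 GB of a GTX 1650 Ti.
z_bad = torch.randn(batch_size, img_size * img_size, latent_dim, device="cuda")
```

So the OOM had nothing to do with the earlier freeze; my code simply requested far more memory than I thought it did.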

@ptrblck Thank you for the quick response and, in general, for all your replies on this forum. You have already helped me countless times.
