How to free GPU memory (Nothing works)

Hi @smth, I have tried everything suggested in the discussions here and elsewhere, but I can't find a working solution with PyTorch. I am seeking your help.

How can I free up the memory of my GPU?

[time 1] used_gpu_memory = 10 MB
[time 2] model = ResNet(Bottleneck, [3, 3, 3, 3],100).cuda()
[time 2] used_gpu_memory = 889 MB
[time 3] del model
[time 4] torch.cuda.empty_cache()
[time 4] used_gpu_memory = 627 MB

I also tried gc.collect(), but it does not help either. This unreleased memory is causing me serious problems during training.
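For reference, the usual release sequence and the two memory counters involved can be sketched like this (a minimal illustration, not taken from the thread; it is guarded so it also runs without a GPU):

```python
import gc
import torch

def cuda_mem_mb():
    """(allocated, reserved) in MiB: bytes held by live tensors vs. bytes the
    caching allocator is still holding from the driver. (0, 0) without a GPU."""
    if not torch.cuda.is_available():
        return 0.0, 0.0
    return (torch.cuda.memory_allocated() / 2**20,
            torch.cuda.memory_reserved() / 2**20)

if torch.cuda.is_available():
    x = torch.empty(256, 1024, 1024, device="cuda")   # ~1 GiB of float32
    print("after alloc:", cuda_mem_mb())
    del x                           # drop the *last* Python reference
    gc.collect()                    # break any reference cycles keeping tensors alive
    print("after del:  ", cuda_mem_mb())   # allocated drops, reserved does not
    torch.cuda.empty_cache()        # hand cached blocks back to the driver
    print("after empty:", cuda_mem_mb())   # reserved drops too; nvidia-smi will
                                           # still show the CUDA context itself
```

Note that nvidia-smi always shows the per-process CUDA context on top of whatever `memory_reserved()` reports, which is why the number never reaches zero while the process is alive.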


Have you tried terminating your script and resetting the GPUs?

Yeah. Terminating the script frees the memory, but the same thing happens again the next time I run it. @smth, any thoughts on that? I want to free that redundant memory while the script is running, so @PTA, resetting the GPU mid-run is not an option for me.

I think it makes sense that some memory remains in use after torch.cuda.empty_cache(). In my experience that should not affect your model training, unless there is a memory leak.

That memory is occupied by something, and I want to release it so I can use it, e.g. to put new tensors on the GPU, but that space remains occupied.
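One point worth checking here: memory that nvidia-smi reports as "used" is not necessarily lost to you, because PyTorch's caching allocator recycles freed blocks for new tensors within the same process. A small sketch (my own illustration, not from the thread) that probes this:

```python
import torch

def cached_block_reused():
    """Allocate, free, and reallocate a same-size CUDA tensor, and report
    whether the second allocation reuses the first tensor's memory block.
    Returns None when no GPU is present."""
    if not torch.cuda.is_available():
        return None
    a = torch.empty(1024, 1024, device="cuda")
    ptr = a.data_ptr()
    del a                                   # block returns to PyTorch's cache
    b = torch.empty(1024, 1024, device="cuda")
    return b.data_ptr() == ptr              # typically True: cached block reused
```

So new tensors can usually be served from the "occupied" space without any explicit freeing; empty_cache() only matters when *other* processes need that memory.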

@smth Still waiting for your response.

@KnHuq we do occupy some base memory of ~400MB per GPU for the CUDA context, CUDA RNG state, streams, etc., and ~200MB per GPU for cuDNN handles etc.
It also depends on the GPU model; the numbers I quoted are for a Volta V100 GPU.
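To separate that fixed context overhead from memory your model actually occupies, you can count the parameter bytes directly. A small sketch (my own example; the thread's ResNet would work the same way):

```python
import torch.nn as nn

def param_mb(model: nn.Module) -> float:
    """MiB occupied by a model's parameters (weights only; gradients,
    optimizer state, activations, and the CUDA context come on top)."""
    return sum(p.nelement() * p.element_size()
               for p in model.parameters()) / 2**20

# A small stand-in model: 1024*1024 weights + 1024 biases, float32.
m = nn.Linear(1024, 1024)
print(round(param_mb(m), 2))   # -> 4.0
```

The gap between this number and what nvidia-smi shows is, roughly, the context/cuDNN overhead described above plus any cached allocator blocks.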


If I do ops like:

x = conv(x)
x = conv(x)

I found that the very first x is kept alive even after the name has been rebound to conv(x), so the sequence consumes up to three times as much memory as x alone. How can I free that now-useless first x?

Double post from here. Let's continue the discussion in the other topic. :wink: