About GPU Memory Usage

Hello,

I am training a model using 1 of my 2 GPUs and wanted to ask something about the mechanics of GPU memory usage of PyTorch.

After training and evaluating the model and getting my results, nothing is running in my IPython notebook anymore. However, when I check the GPU memory, I see that a huge chunk of it is still in use.

Why is this happening? The same thing sometimes happens in Google Colab, where I have to find the process and kill it manually. Why wouldn’t it let go of my GPU memory so I can quickly try another experiment and see the results of a different configuration?

(Also, what do you recommend in this situation? How can I run quick experiments one after another once I am done with training/evaluation?)
Thank you very much in advance for your help.

Are you restarting the Python kernel inside the notebook, or are you running all cells and then “waiting” in the last cell?
In the latter case, the Python kernel would still be alive, and you would be able to write new code in the next cell and execute it. PyTorch uses a caching allocator, which tries to reuse memory instead of freeing and reallocating it (which would be slow). Memory that your tensors no longer need therefore stays reserved by the process, which could explain the memory usage you see while the kernel is still alive.
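
To see the effect of the caching allocator yourself, you could compare the memory occupied by live tensors with the memory the allocator has reserved. This is only a minimal sketch: `gpu_memory_mib` is a helper name made up for this post, and the 256 MiB tensor is an arbitrary example size. `torch.cuda.memory_allocated`, `torch.cuda.memory_reserved` and `torch.cuda.empty_cache` are the actual PyTorch calls.

```python
import torch


def gpu_memory_mib(device=0):
    """Return (allocated, reserved) in MiB; (0.0, 0.0) if no CUDA device is present.

    Hypothetical helper for illustration, not part of PyTorch.
    """
    if not torch.cuda.is_available():
        return 0.0, 0.0
    mib = 1024 ** 2
    allocated = torch.cuda.memory_allocated(device) / mib  # used by live tensors
    reserved = torch.cuda.memory_reserved(device) / mib    # held by the caching allocator
    return allocated, reserved


if torch.cuda.is_available():
    x = torch.empty(1024, 1024, 64, device="cuda")  # float32, about 256 MiB
    print("with tensor:      ", gpu_memory_mib())

    del x  # the tensor is gone, but its block stays in PyTorch's cache
    print("after del:        ", gpu_memory_mib())

    torch.cuda.empty_cache()  # hand unused cached blocks back to the driver
    print("after empty_cache:", gpu_memory_mib())
```

After `del x` the allocated number should drop while the reserved number stays roughly the same, and that reserved memory is what `nvidia-smi` keeps showing as used. It is still available to the next allocation made in the same kernel.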

I guess I got the point.

I am not restarting the Python kernel inside the notebook (I run all the cells and then wait for the next commands in the next cell), so I guess it is because of the caching allocator.

I just thought that I would be able to train the model again in the same kernel without any constraints. As it stands, however, I need to restart the kernel and rerun all of my code above the training part.

Thank you so much for the answer, this clears things up for me about the notebook kernel.

You should be able to reuse the memory that is in the cache, unless all of it is really in use, e.g. by tensors. If you are running out of memory in the next cell, your code might be (accidentally) storing some outputs, computation graphs, etc.
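
As an illustration of what "accidentally storing outputs" can look like, here is a small sketch. The tiny `nn.Linear` model, the random inputs and the list names are placeholders invented for this example, not taken from your code. It contrasts keeping raw model outputs (which stay on the device and hold on to their computation graph) with keeping detached CPU copies.

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Linear(16, 1).to(device)  # placeholder model for illustration

kept_with_graph = []  # problematic: each entry keeps its autograd graph alive
kept_detached = []    # safer: plain CPU tensors without any graph

for _ in range(3):
    x = torch.randn(8, 16, device=device)

    # The output has a grad_fn, so storing it also keeps the tensors
    # autograd saved for the backward pass in (GPU) memory.
    kept_with_graph.append(model(x))

    # Under no_grad no graph is recorded, and .cpu() moves the result
    # off the GPU before it is stored.
    with torch.no_grad():
        kept_detached.append(model(x).cpu())

print(all(out.grad_fn is not None for out in kept_with_graph))
print(all(out.grad_fn is None for out in kept_detached))
```

Dropping the references (for example `del kept_with_graph`) makes that memory reusable by the allocator in the next cell. For logging a scalar loss, storing `loss.item()` instead of `loss` avoids the same problem.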

Oh I see,

So code that is structured and designed carefully would allow me to reuse the memory. (In this case I guess I am storing some kind of computation.)

Thanks!