How to effectively release a Tensor in PyTorch?

Hi, All

How can we release a PyTorch tensor once we decide not to use it anymore, in both the Python API and the C++ API?

Thanks!

In Python, `del tensor` will work; in libtorch you could use e.g. this approach.
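A minimal sketch of what `del` does on the Python side, using a plain object as a stand-in for a tensor (the reference-counting behaviour is the same: CPython destroys the object once the last strong reference is dropped, and a tensor's destructor then returns its storage to the allocator):

```python
import gc
import weakref

class FakeTensor:
    """Stand-in for a tensor, used here so the sketch runs without PyTorch."""
    pass

t = FakeTensor()
alive = weakref.ref(t)   # weak reference lets us observe the object's lifetime

del t                    # drop the only strong reference
gc.collect()             # not strictly needed for refcounted objects, but explicit

print(alive() is None)   # True: the object has been destroyed
```

Note that `del` only removes the name binding; if another variable still references the same tensor, the memory is not released until that reference is gone too.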


Please note that in libtorch, for tensors on the GPU, you may have to call c10::cuda::CUDACachingAllocator::empty_cache() after the tensor goes out of scope if you want PyTorch to release this memory back to the device. Otherwise the memory is still freed when the tensor goes out of scope, but it is kept available in libtorch's cache. If you want to free that memory for other processes, you must call empty_cache().
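The same behaviour can be sketched from the Python API (assuming `torch` is installed; `torch.cuda.empty_cache()` is the Python analogue of the C++ call c10::cuda::CUDACachingAllocator::empty_cache(), and the sketch falls back gracefully when no GPU is present):

```python
import torch

def free_cached_gpu_memory():
    """Allocate a CUDA tensor, drop it, then return cached blocks to the driver."""
    if not torch.cuda.is_available():
        return "no CUDA device"
    t = torch.ones(1024, 1024, device="cuda")  # ~4 MB allocation
    del t                      # memory goes back to PyTorch's caching allocator
    torch.cuda.empty_cache()   # cached, unused blocks are returned to the driver
    return "cache emptied"

print(free_cached_gpu_memory())
```

Tools like `nvidia-smi` only show memory as freed after the `empty_cache()` step, because until then the caching allocator keeps the blocks reserved for reuse.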


Hi,

I have one more question. Assume I create a customized PyTorch API that creates a tensor inside a C++ function during execution. For example:

A = create_a_CUDA_tensor_via_customized_CPP_function();

Inside create_a_CUDA_tensor_via_customized_CPP_function(), I create and return a tensor, e.g. torch::ones().cuda().

If later on, at some point, I want to assign A to another tensor:

A = Another_create_a_CUDA_tensor_via_customized_CPP_function();

Do I need to del A before calling the other function, to avoid a GPU memory leak?

Thanks!

Hey Daniel, just seeing this.

If you are allocating memory on the GPU in a custom function, you can call empty_cache() right after the function to clean up any memory from the original tensor constructed within it. As long as you clean up after each function call, you do not need to worry about deleting any memory yourself. By calling empty_cache() after the reassignment of A, you clean up any unused memory held in the cache.

Note that if you do not do this, it is not a memory leak. Emptying the cache only releases memory that PyTorch is holding on to. In reality, PyTorch frees the tensor's memory without you having to call empty_cache(); it just keeps that memory in its cache so it can serve subsequent GPU allocations quickly. You only want to call empty_cache() if you want to free the GPU memory for other processes to use (other models, programs, etc.).