Free all GPU memory used in between runs

Deleting all objects and references that point to objects allocating GPU memory is the right approach: it frees the memory so PyTorch's caching allocator can reuse it. Calling torch.cuda.empty_cache() additionally releases the cached memory back to the device, so everything except the memory used by the CUDA context is freed.
Here is a small example:

import torch
import torch.nn as nn


def memory_stats():
    # currently allocated tensor memory in MB
    print(torch.cuda.memory_allocated()/1024**2)
    # memory reserved by the caching allocator in MB
    # (memory_cached() was renamed to memory_reserved() in newer PyTorch releases)
    print(torch.cuda.memory_reserved()/1024**2)


def allocate():
    # x is local to this function and is freed when it returns
    x = torch.randn(1024*1024, device='cuda')
    memory_stats()
    
    
memory_stats()
# 0.0
# 0.0

allocate()
# 4.0 # allocated inside the function
# 20.0 # used cache

memory_stats()
# 0.0 # the local tensor was freed when allocate() returned
# 20.0 # cache is still alive

torch.cuda.empty_cache()
memory_stats()
# 0.0
# 0.0 # cache is free again

x = torch.randn(1024, 1024, device='cuda')
memory_stats()
# 4.0
# 20.0

# store a second reference to the same tensor
y = x

del x # this does not free the memory of x since y still points to it
memory_stats() 
# 4.0  
# 20.0

del y # this allows PyTorch to free the memory and reuse it in the cache
memory_stats()
# 0.0
# 20.0

torch.cuda.empty_cache()
memory_stats() 
# 0.0
# 0.0
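
The same pattern applies if you want to free all GPU memory in between full runs. Below is a minimal sketch, assuming a toy nn.Linear model and SGD optimizer (these are illustrative and not part of the original example): delete every reference that keeps GPU memory alive, then empty the cache.

import torch
import torch.nn as nn

# toy model and optimizer, for illustration only
model = nn.Linear(1024, 1024).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# one training step allocates parameters, gradients, and activations
out = model(torch.randn(64, 1024, device='cuda'))
loss = out.mean()
loss.backward()
optimizer.step()

# drop every reference that keeps GPU memory alive ...
del model, optimizer, out, loss
# ... and release the now-unused cache back to the device
torch.cuda.empty_cache()

print(torch.cuda.memory_allocated() / 1024**2)
print(torch.cuda.memory_reserved() / 1024**2)
# both should be back to 0.0; only the CUDA context remains on the device

If memory still appears to be in use after this, something else (e.g. a stored output or loss tensor) is usually still holding a reference.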