PyTorch running out of memory during inference

Hi all,

I am running a third-party model in inference mode as an ASE calculator to optimize chemical structures. The issue is that after 66 out of ~200 structures, PyTorch runs out of GPU memory. I have been trying to track what happens using torch.cuda.memory_allocated, torch.cuda.memory_reserved, and torch.cuda.max_memory_reserved.
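Roughly how I am polling those stats between structures (the helper is illustrative, not the exact code):

```python
import torch

def report_cuda_memory(tag: str) -> None:
    # All three counters are reported in GiB for readability.
    gib = 1024 ** 3
    print(
        f"[{tag}] "
        f"allocated={torch.cuda.memory_allocated() / gib:.2f} GiB  "
        f"reserved={torch.cuda.memory_reserved() / gib:.2f} GiB  "
        f"max_reserved={torch.cuda.max_memory_reserved() / gib:.2f} GiB"
    )
```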

The weird thing is that while those report expected values of <1 GB of memory in use, after 66 iterations I get the following error:

torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 496.00 MiB. GPU 0 has a total capacity of 79.10 GiB of which 48.75 MiB is free. Including non-PyTorch memory, this process has 79.04 GiB memory in use. Of the allocated memory 77.16 GiB is allocated by PyTorch, and 928.93 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (CUDA semantics — PyTorch 2.7 documentation)

The main loop is inside a “with torch.no_grad():” block, only one instance of the calculator (model) is used, and torch.cuda.empty_cache() and torch.cuda.reset_peak_memory_stats() are called after every iteration. (Without the latter call, torch.cuda.max_memory_reserved does grow steadily, on par with what I would expect right before the error.)
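A stripped-down sketch of the loop (make_calculator, structures, and the fmax value are placeholders, not the real code):

```python
import torch
from ase.optimize import BFGS

calc = make_calculator()  # placeholder: the single third-party calculator instance

with torch.no_grad():
    for atoms in structures:  # ~200 ASE Atoms objects
        atoms.calc = calc
        BFGS(atoms).run(fmax=0.05)  # fmax is illustrative

        report_cuda_memory("after relaxation")  # consistently reports < 1 GB
        torch.cuda.empty_cache()                # release unoccupied cached blocks
        torch.cuda.reset_peak_memory_stats()    # reset max_memory_reserved for the next structure
```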

I am at a loss: where is that memory going, and how do I release it?

Thanks!

The error shows ~79 GiB in use, so your manual memory checks might not be fine-grained enough. I would check whether the model itself internally allocates (and holds on to) this large amount of memory.
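One way to get a fine-grained view is to record the allocator history and dump a snapshot when the OOM hits; the snapshot shows which stack traces own the allocated memory. A sketch, assuming a recent PyTorch release (run_optimizations is a placeholder for your ASE loop, and the underscore-prefixed snapshot API is the one described in the PyTorch memory-debugging docs):

```python
import torch

# Start recording allocation/free events with stack traces.
torch.cuda.memory._record_memory_history(max_entries=100_000)

try:
    run_optimizations()  # placeholder for your loop over structures
except torch.OutOfMemoryError:
    # Dump the allocator state at the failure point; open the file at
    # https://pytorch.org/memory_viz to see who owns the ~77 GiB.
    torch.cuda.memory._dump_snapshot("oom_snapshot.pickle")
    raise
finally:
    # Stop recording.
    torch.cuda.memory._record_memory_history(enabled=None)
```

Printing torch.cuda.memory_summary() once per iteration is a lighter-weight alternative if you just want to see which counter is creeping up.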