I imagined that the difference between allocated and reserved memory is the following:
- allocated memory is the amount of memory that is actually in use by PyTorch.
- reserved memory is the allocated memory plus pre-cached memory.
If that is correct, the following should hold:
- reserved memory >= allocated memory
- reserved memory ~= allocated memory after calling `torch.cuda.empty_cache()`
Is my understanding correct?
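To make the relationship I have in mind concrete, here is a toy caching allocator, purely as an analogy (all names are hypothetical and this is not how PyTorch's allocator is actually implemented):

```python
class ToyCachingAllocator:
    """Toy sketch for intuition only, not PyTorch internals.

    'allocated' = bytes handed out to live tensors.
    'reserved'  = allocated + bytes kept in a cache for reuse.
    """

    def __init__(self):
        self.allocated = 0
        self.cached = 0

    @property
    def reserved(self):
        return self.allocated + self.cached

    def malloc(self, n):
        # Reuse cached memory before requesting more from the device.
        reuse = min(n, self.cached)
        self.cached -= reuse
        self.allocated += n

    def free(self, n):
        # Freed blocks go back into the cache instead of to the device.
        self.allocated -= n
        self.cached += n

    def empty_cache(self):
        # Analogue of torch.cuda.empty_cache(): drop the cached memory.
        self.cached = 0


a = ToyCachingAllocator()
a.malloc(100)
a.free(40)
assert a.reserved >= a.allocated   # reserved >= allocated
a.empty_cache()
assert a.reserved == a.allocated   # after empty_cache, reserved ~= allocated
```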
I’m asking this since I have trouble determining the peak memory requirement for a piece of code. I’ve asked about something similar before (Cc @ptrblck ).
My setup is as follows:
```python
for param in params:
    torch.cuda.empty_cache()
    torch.cuda.reset_peak_memory_stats()
    try:
        test_function(param)
    except RuntimeError:
        break
    finally:
        print(torch.cuda.memory_summary())
```
Here is the compressed output:
| param | failing | Allocated memory (Peak Usage) | GPU reserved memory (Peak Usage) |
|---|---|---|---|
| 1 | False | 15006 MB | 16726 MB |
| 2 | False | 17402 MB | 19354 MB |
| 3 | False | 19961 MB | 22184 MB |
| 4 | True | 20609 MB | 22454 MB |
(Note that for `param==4` the memory report was generated after the error was raised and thus does not reflect the actual memory usage for the whole of `test_function`.)
The memory requirement grows approximately quadratically. A quick extrapolation for the failing param (`param==4`) gives 22683 MB of allocated memory and 25210 MB of reserved memory. I have 24189 MB available and no other processes are running. Thus, if my understanding of allocated and reserved memory is correct, this case should not fail.
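For reference, the extrapolation can be reproduced with a simple second-difference (quadratic) forecast from the three non-failing rows; `quad_extrapolate` is just an illustrative helper, and the exact-fit forecast for reserved memory comes out a few MB above the 25210 MB I quoted:

```python
def quad_extrapolate(y1, y2, y3):
    """Forecast the next value of a sequence assuming constant
    second differences, i.e. quadratic growth."""
    d1 = y2 - y1
    d2 = y3 - y2
    return y3 + d2 + (d2 - d1)

# Peak allocated memory for param = 1, 2, 3 (MB), forecast for param = 4:
print(quad_extrapolate(15006, 17402, 19961))  # 22683

# Peak reserved memory for param = 1, 2, 3 (MB), forecast for param = 4:
print(quad_extrapolate(16726, 19354, 22184))  # 25216
```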
Can someone explain why this is not the case?