I imagined that the difference between allocated and reserved memory is the following:
- allocated memory is the amount of memory that is actually in use by PyTorch.
- reserved memory is the allocated memory plus pre-cached memory.
If that is correct, the following should hold:
- reserved memory >= allocated memory
- reserved memory ~= allocated memory after calling `torch.cuda.empty_cache()`
Is my understanding correct?
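To make the relationship I have in mind concrete, here is a toy caching allocator, purely as an analogy (all names are hypothetical and this is not how PyTorch's allocator is actually implemented):

```python
class ToyCachingAllocator:
    """Toy sketch for intuition only, not PyTorch internals.

    'allocated' = bytes handed out to live tensors.
    'reserved'  = allocated + bytes kept in a cache for reuse.
    """

    def __init__(self):
        self.allocated = 0
        self.cached = 0

    @property
    def reserved(self):
        return self.allocated + self.cached

    def malloc(self, n):
        # Reuse cached memory before requesting more from the device.
        reuse = min(n, self.cached)
        self.cached -= reuse
        self.allocated += n

    def free(self, n):
        # Freed blocks go back into the cache instead of to the device.
        self.allocated -= n
        self.cached += n

    def empty_cache(self):
        # Analogue of torch.cuda.empty_cache(): drop the cached memory.
        self.cached = 0


a = ToyCachingAllocator()
a.malloc(100)
a.free(40)
assert a.reserved >= a.allocated   # reserved >= allocated
a.empty_cache()
assert a.reserved == a.allocated   # after empty_cache, reserved ~= allocated
```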
I’m asking this since I have trouble determining the peak memory requirement for a piece of code. I’ve asked about something similar before (Cc @ptrblck ).
My setup is as follows:
```python
for param in params:
    torch.cuda.empty_cache()
    torch.cuda.reset_peak_memory_stats()
    try:
        test_function(param)
    except RuntimeError:
        break
    finally:
        print(torch.cuda.memory_summary())
```
Here is the compressed output:
| param | failing | Allocated memory (Peak Usage) | GPU reserved memory (Peak Usage) |
|---|---|---|---|
| 1 | False | 15006 MB | 16726 MB |
| 2 | False | 17402 MB | 19354 MB |
| 3 | False | 19961 MB | 22184 MB |
| 4 | True | 20609 MB | 22454 MB |
(Note that for `param==4` the memory report was generated after the error was raised and thus does not reflect the actual memory usage for the whole of `test_function`.)
The memory requirement grows approximately quadratically. A quick extrapolation for the failing param (`param==4`) gives 22683 MB of allocated memory and 25210 MB of reserved memory. I have 24189 MB available and no other processes are running. Thus, if my understanding of allocated and reserved memory is correct, this case should not fail.
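For reference, the extrapolation can be reproduced with a simple second-difference (quadratic) forecast from the three non-failing rows; `quad_extrapolate` is just an illustrative helper, and the exact-fit forecast for reserved memory comes out a few MB above the 25210 MB I quoted:

```python
def quad_extrapolate(y1, y2, y3):
    """Forecast the next value of a sequence assuming constant
    second differences, i.e. quadratic growth."""
    d1 = y2 - y1
    d2 = y3 - y2
    return y3 + d2 + (d2 - d1)

# Peak allocated memory for param = 1, 2, 3 (MB), forecast for param = 4:
print(quad_extrapolate(15006, 17402, 19961))  # 22683

# Peak reserved memory for param = 1, 2, 3 (MB), forecast for param = 4:
print(quad_extrapolate(16726, 19354, 22184))  # 25216
```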
Can someone explain why this is not the case?