memory_allocated() vs memory_cached()


From the PyTorch documentation, it remains unclear to me what the difference between the two functions is: it seems to me that cached memory is a superset of allocated memory. Could you give me a more detailed explanation of this?


The allocated memory is the memory that is currently used to store Tensors on the GPU.
The cached memory is the memory that is currently held on the GPU by PyTorch (as can be seen in nvidia-smi).
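To make the distinction concrete, here is a minimal sketch that reads both counters at three points: while a tensor is alive, after it is deleted, and after the cache is emptied. It assumes a recent PyTorch release, where memory_cached() goes by the name torch.cuda.memory_reserved(); memory_snapshots is a hypothetical helper name, not part of PyTorch.

```python
try:
    import torch
except ImportError:  # allow the sketch to load where PyTorch is not installed
    torch = None


def memory_snapshots(device=0):
    """Return [(label, allocated_bytes, reserved_bytes), ...], or None without CUDA."""
    if torch is None or not torch.cuda.is_available():
        return None

    def snap(label):
        return (label,
                torch.cuda.memory_allocated(device),
                torch.cuda.memory_reserved(device))

    snaps = []
    x = torch.empty(1024, 1024, device=f"cuda:{device}")  # about 4 MiB of float32
    snaps.append(snap("tensor alive"))
    del x  # the tensor is freed, but its block normally stays in PyTorch's cache
    snaps.append(snap("tensor deleted"))
    torch.cuda.empty_cache()  # hand unused cached blocks back to the driver
    snaps.append(snap("cache emptied"))
    return snaps


result = memory_snapshots()
if result is None:
    print("No CUDA device available; nothing to report.")
else:
    for label, allocated, reserved in result:
        print(f"{label:>15}: allocated={allocated} B, reserved={reserved} B")
```

On a machine with a GPU, the allocated figure should drop once the tensor is deleted, while the reserved figure usually only drops after empty_cache() is called.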

Hi albanD, I am still confused. According to your explanation, cached should include allocated (as the Tensors are used by PyTorch). However, I encountered the error “Tried to allocate 5.57 GiB (GPU 0; 15.77 GiB total capacity; 5.83 GiB already allocated; 3.29 GiB free; 5.51 GiB cached)”. It seems that allocated is larger than cached. Besides, even though only my process is running on “cuda:0”, the “GPU memory usage” is always inconsistent with cached. Could you help me estimate how much more GPU memory can still be used? I am currently using free + (cached - allocated).
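For reference, the estimate described above works out as follows with the figures from the error message. This is only a sketch of the poster's heuristic, not an official formula; estimate_usable_gib is a hypothetical helper name, and the inputs are the values quoted in the error.

```python
def estimate_usable_gib(free, cached, allocated):
    """Poster's heuristic: free memory plus cached memory not held by tensors.

    All arguments are in GiB. This ignores fragmentation, so a single
    contiguous block of this size may still fail to allocate.
    """
    return free + (cached - allocated)


# Values copied from the out-of-memory message quoted in the post.
estimate = estimate_usable_gib(free=3.29, cached=5.51, allocated=5.83)
print(f"{estimate:.2f} GiB")  # prints "2.97 GiB"
```

Because the message reports more allocated than cached memory, the (cached - allocated) term is negative here, so the estimate comes out below the reported free memory and well below the 5.57 GiB that was requested.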


This is quite an old comment.
You can find up-to-date information in the docs here.