Caching Memory Allocator


I found in the documentation that PyTorch uses a caching memory allocator for fast memory deallocation without device synchronization.

Is this a general concept or a PyTorch-specific memory management technique?
Are there any detailed documents or other resources about caching memory allocators?
I want to understand how a caching memory allocator works.


It’s a common pattern used on top of CUDA in different frameworks to avoid expensive direct allocations: freed blocks are kept in a pool and reused for later requests instead of being returned to the device. You could take a look into the implementation for detailed information.
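To make the idea concrete, here is a minimal sketch of the pattern in plain Python. All names and the size-rounding policy are invented for illustration; this is not PyTorch's actual implementation, which lives in C++ and handles streams, fragmentation, and block splitting.

```python
# Toy caching allocator: expensive backend allocations (think cudaMalloc)
# are pooled and reused, so "free" never calls back into the backend
# (analogous to avoiding a synchronizing cudaFree).

class CachingAllocator:
    def __init__(self, backend_alloc):
        self.backend_alloc = backend_alloc   # hypothetical expensive allocator
        self.free_blocks = {}                # rounded size -> list of cached blocks
        self.backend_calls = 0

    def _round_size(self, size, granularity=512):
        # Round sizes up so blocks of similar size share a pool.
        return (size + granularity - 1) // granularity * granularity

    def malloc(self, size):
        size = self._round_size(size)
        pool = self.free_blocks.get(size)
        if pool:
            return pool.pop()                # cache hit: no backend call
        self.backend_calls += 1              # cache miss: expensive allocation
        return self.backend_alloc(size)

    def free(self, block):
        # No backend call; just return the block to the pool for reuse.
        self.free_blocks.setdefault(len(block), []).append(block)


allocator = CachingAllocator(backend_alloc=lambda size: bytearray(size))
a = allocator.malloc(1000)                   # backend call #1 (rounded to 1024)
allocator.free(a)
b = allocator.malloc(1000)                   # reused from the pool, no backend call
print(allocator.backend_calls)               # → 1
```

The key property is visible in the last line: two allocations, one backend call, and `free` never touched the backend at all.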

That was fast :slight_smile: