FreeCudaCallbackRegistry()

It seems that the CUDA allocator in c10 has a callback registry. Would it be possible to add other callbacks to it using the C++ frontend?

Context: the R garbage collector is lazy and runs only when R needs more memory. R is not aware of CUDA memory, so it won't collect even when GPU memory is held by objects that no longer have references. I'd therefore like to add a callback to the CUDA memory allocator so it calls the R garbage collector whenever the allocator needs more memory.

Any suggestion is appreciated, thanks very much!!

see: https://github.com/pytorch/pytorch/blob/master/c10/cuda/CUDACachingAllocator.cpp#L617-L624

bool trigger_free_memory_callbacks(AllocParams& p) {
    bool freed_memory = false;
    for (const auto& name : FreeCudaMemoryCallbacksRegistry()->Keys()) {
      freed_memory |=
        FreeCudaMemoryCallbacksRegistry()->Create(name)->Execute();
    }
    return freed_memory;
}

@dfalbel
I've never done it before, but I think it is doable.

The FreeCudaMemoryCallbacksRegistry is defined here and is used to register FreeMemoryCallbacks:
https://github.com/pytorch/pytorch/blob/master/c10/cuda/CUDACachingAllocator.cpp#L23

FreeCudaMemoryCallbacksRegistry is actually a c10::Registry, which is defined here:
https://github.com/pytorch/pytorch/blob/master/c10/util/Registry.h#L215
https://github.com/pytorch/pytorch/blob/master/c10/util/Registry.h#L268

FreeMemoryCallback is defined here:
https://github.com/pytorch/pytorch/blob/6f396e18c33efbb8547b7ca9d412a316bd51429b/c10/cuda/CUDACachingAllocator.h#L20

So to add your own free memory callback you can do the following:

  1. Write your own callback class by inheriting from FreeMemoryCallback and
    implementing its Execute() method.
  2. Register it with FreeCudaMemoryCallbacksRegistry()->Register(…).
    See c10::Registry to learn what parameters should be passed in; it is basically
    a key for your customized memory callback class and a creator function pointer.

Your customized free memory callback will then be called from trigger_free_memory_callbacks.

Thanks @glaringlee ! This is very helpful! Just one follow-up question: would you know if the allocators/callbacks are always called from the main thread, or is there no guarantee of this?

@dfalbel
I think this depends on where you call malloc: it can happen either in the main thread or in other threads.
https://github.com/pytorch/pytorch/blob/master/c10/cuda/CUDACachingAllocator.cpp#L214
The memory free callback is triggered whenever you call malloc().
