Runtime memory breakdown (activations, weights, etc.)

I am looking for a way to link CUDA memory allocations (made by the CUDACachingAllocator, which only knows the requested size at that point) to what the memory is actually used for. Ultimately this would give me a breakdown of where GPU memory is being consumed (i.e., weights, activations, etc.). I have been looking for relevant metadata in c10/core/TensorImpl.h, c10/core/StorageImpl.h, and other places, hoping I could modify the source code to pass this information down to the allocator directly. Is there any way to find out why a given amount of memory was allocated, or is there another way to accomplish this?
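To make the goal concrete, the coarsest version I can think of is diffing the allocator's counter around each phase of a step. This is only a sketch: `PhaseMemoryLog` and `cuda_allocated_bytes` are hypothetical helper names I made up, and the only PyTorch calls used are `torch.cuda.is_available()` and `torch.cuda.memory_allocated()`. It attributes memory per phase, not per allocation, which is exactly the limitation I would like to get past.

```python
import contextlib

import torch


def cuda_allocated_bytes():
    """Bytes held by live tensors on the current CUDA device (0 if CUDA is absent)."""
    return torch.cuda.memory_allocated() if torch.cuda.is_available() else 0


class PhaseMemoryLog:
    """Hypothetical helper: charges the change in allocated bytes to a named phase."""

    def __init__(self, probe=cuda_allocated_bytes):
        self.probe = probe  # any zero-argument callable returning a byte count
        self.deltas = {}

    @contextlib.contextmanager
    def phase(self, name):
        before = self.probe()
        try:
            yield
        finally:
            # Accumulate, so a phase entered several times is summed.
            self.deltas[name] = self.deltas.get(name, 0) + self.probe() - before


device = "cuda" if torch.cuda.is_available() else "cpu"
log = PhaseMemoryLog()

with log.phase("weights"):
    model = torch.nn.Linear(1024, 1024).to(device)

with log.phase("forward"):
    # The input batch is allocated here too, so it is charged to this phase.
    out = model(torch.randn(64, 1024, device=device))

print(log.deltas)  # all zeros on a machine without CUDA
```

Anything freed inside a phase is subtracted again, so this only shows the net growth per phase and cannot tell two tensors allocated in the same phase apart.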

I also looked at nn.Module's named_parameters(), but it did not give me all the relevant information; for example, it says nothing about which data structures hold the activations.
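For reference, here is roughly how far I can get from the Python side. It is a minimal sketch, not a full answer: `tensor_bytes`, `static_bytes`, and `saved_for_backward_bytes` are hypothetical names of my own, and I am assuming that `torch.autograd.graph.saved_tensors_hooks` (available in recent PyTorch releases) is an acceptable proxy for "activations", since it only sees tensors that autograd saves for the backward pass.

```python
import torch
from torch import nn


def tensor_bytes(t):
    """Size of a tensor's elements in bytes (ignores allocator rounding)."""
    return t.numel() * t.element_size()


def static_bytes(model):
    """Bytes held by parameters and by buffers (e.g. BatchNorm running stats)."""
    params = sum(tensor_bytes(p) for p in model.parameters())
    buffers = sum(tensor_bytes(b) for b in model.buffers())
    return params, buffers


def saved_for_backward_bytes(model, *inputs):
    """Approximate bytes of non-parameter tensors that autograd saves during forward."""
    param_ptrs = {p.data_ptr() for p in model.parameters()}
    seen = set()
    total = 0

    def pack(t):
        nonlocal total
        ptr = t.data_ptr()
        # Skip weights (including views such as weight.t()) and tensors already counted.
        if ptr not in param_ptrs and ptr not in seen:
            seen.add(ptr)
            total += tensor_bytes(t)
        return t  # store the tensor unchanged

    def unpack(t):
        return t

    with torch.autograd.graph.saved_tensors_hooks(pack, unpack):
        out = model(*inputs)
    return total, out


model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
params, buffers = static_bytes(model)
activations, _ = saved_for_backward_bytes(model, torch.randn(2, 8))
print(f"params={params} B, buffers={buffers} B, saved for backward={activations} B")
```

Deduplicating by `data_ptr()` is crude (views at different offsets into one storage are counted separately), and none of this covers gradients, optimizer state, or temporary workspaces, which is why I was hoping to get the information at the allocator level instead.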
Thank you.
