Regarding the PyTorch profiler: it seems to only trace CPU memory instead of GPU memory. Is there any tool to trace CUDA memory usage for each part of a model?
Try GitHub - Stonesjtu/pytorch_memlab: Profiling and inspecting memory in pytorch, though it may be easier to just manually wrap some code blocks and measure usage deltas (of cuda.memory_allocated).
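The manual approach could look something like this: snapshot `torch.cuda.memory_allocated()` before and after a code block and take the difference. `measure_cuda_delta` is a hypothetical helper name, not from either library; this sketch assumes a CUDA device is available for the demo at the bottom.

```python
import torch

def measure_cuda_delta(fn):
    """Run fn and report how many bytes of CUDA memory its result retains.
    (Hypothetical helper; only the delta of live allocations, not peak usage.)"""
    torch.cuda.synchronize()
    before = torch.cuda.memory_allocated()
    out = fn()
    torch.cuda.synchronize()
    after = torch.cuda.memory_allocated()
    return out, after - before

# Demo, guarded so the sketch is harmless on CPU-only machines.
if torch.cuda.is_available():
    out, delta = measure_cuda_delta(lambda: torch.zeros(1024, 1024, device="cuda"))
    print(f"block retained ~{delta} bytes of CUDA memory")
```

Note this only sees memory still allocated when the block returns; for peak usage inside the block you'd look at `torch.cuda.max_memory_allocated()` instead.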
Thanks for your reply, I’ll try it.
Is there an official PyTorch profiler for GPU memory?
AFAIK, it only has torch.profiler.profile(profile_memory=True) as an aggregator; I'm not sure if that produces useful results (there is an undocumented autograd.profiler.record_function("X") to mark code blocks)…
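Combining the two looks roughly like this (a sketch run on CPU so it works anywhere; the `"forward_pass"` label and the toy model are made up for illustration):

```python
import torch
from torch.profiler import profile, record_function, ProfilerActivity

model = torch.nn.Linear(128, 128)
x = torch.randn(32, 128)

with profile(activities=[ProfilerActivity.CPU], profile_memory=True) as prof:
    with record_function("forward_pass"):  # mark a code block with a label
        y = model(x)

# Labeled blocks show up alongside individual ops in the summary table.
print(prof.key_averages().table(sort_by="self_cpu_memory_usage", row_limit=5))
```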
Thanks! torch.profiler.profile(profile_memory=True) seems to only report CPU memory usage; I might have to find another way.
There are options for CUDA (version dependent, so check the docs).
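Concretely, this likely means passing ProfilerActivity.CUDA in `activities`. A sketch that falls back to CPU when no GPU is present (the exact sort-key names vary across PyTorch versions, so treat them as an assumption and check your version's docs):

```python
import torch
from torch.profiler import profile, ProfilerActivity

has_cuda = torch.cuda.is_available()

# Request CUDA events too when a GPU is present.
activities = [ProfilerActivity.CPU]
if has_cuda:
    activities.append(ProfilerActivity.CUDA)

device = "cuda" if has_cuda else "cpu"
with profile(activities=activities, profile_memory=True) as prof:
    a = torch.randn(256, 256, device=device)
    b = a @ a

# Sort key depends on device (and on PyTorch version; assumed here).
sort_key = "self_cuda_memory_usage" if has_cuda else "self_cpu_memory_usage"
print(prof.key_averages().table(sort_by=sort_key, row_limit=5))
```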