Hi all,
I was wondering if there's any way to visualize GPU cache occupancy while training or running inference on a model in PyTorch. I came across commands such as torch.cuda.memory._record_memory_history(max_entries=100000), but I am assuming this will not partition the memory into global and shared memory (or cache). I would like to know if there are any ways to understand fine-grained GPU memory allocation. Thank you!
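For context, this is roughly what I had in mind (just a minimal sketch; the model and input shapes are placeholders, and I'm assuming the _dump_snapshot API available in recent PyTorch versions):

```python
import torch
import torch.nn as nn

# Start recording allocator events (alloc/free with stack traces).
torch.cuda.memory._record_memory_history(max_entries=100000)

# Placeholder model / data purely for illustration.
model = nn.Linear(1024, 1024).cuda()
inp = torch.randn(64, 1024, device="cuda")

out = model(inp)
out.sum().backward()

# Dump a snapshot that can be loaded into the interactive viewer
# at https://pytorch.org/memory_viz (assuming that tool is the right one).
torch.cuda.memory._dump_snapshot("memory_snapshot.pickle")

# As far as I can tell, these only report caching-allocator usage in
# global (device) memory, not shared memory or cache occupancy.
print(torch.cuda.memory_allocated(), torch.cuda.memory_reserved())
```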