I am using the PyTorch memory profiling functions to analyse my model. I can see all memory allocations using the export_chrome_trace function of torch.profiler. I was wondering: does this trace contain all memory transactions, or does PyTorch reuse the same allocation more than once?
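For reference, this is a minimal sketch of the profiling setup being described, using `torch.profiler` with `profile_memory=True` and exporting a Chrome trace (the model and shapes are just placeholders):

```python
import torch
from torch.profiler import profile, ProfilerActivity

model = torch.nn.Linear(128, 64)
x = torch.randn(32, 128)

# Record CPU activity plus memory allocation events;
# add ProfilerActivity.CUDA to the list when profiling on a GPU.
with profile(activities=[ProfilerActivity.CPU], profile_memory=True) as prof:
    model(x)

# Writes a trace viewable in chrome://tracing or Perfetto.
prof.export_chrome_trace("trace.json")
print(prof.key_averages().table(sort_by="cpu_memory_usage", row_limit=5))
```

The exported JSON shows allocation and free events as seen by the profiler, which is the level at which the reuse question below applies.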
PyTorch uses a caching allocator and thus reuses already allocated memory where possible, so the trace reflects allocator-level events rather than every physical memory transaction. If you are interested in the actual memory reads and writes of a specific CUDA kernel, you could profile it as described e.g. here.
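The reuse behaviour can be observed directly: a freed tensor's block stays in the allocator's cache and is typically handed back for the next same-sized request. A small sketch (guarded so it also runs without a GPU, since the caching allocator manages CUDA memory):

```python
import torch

if torch.cuda.is_available():
    t = torch.empty(1024, 1024, device="cuda")
    first_ptr = t.data_ptr()
    del t  # block returns to the caching allocator's pool, not to the driver
    u = torch.empty(1024, 1024, device="cuda")
    # The allocator typically serves the cached block again, so the
    # pointer is often identical (not guaranteed by the API, though).
    result = "reused" if u.data_ptr() == first_ptr else "new block"
    print(result, "- reserved bytes still held:", torch.cuda.memory_reserved())
else:
    result = "no CUDA"
    print("No CUDA device available; the caching allocator applies to GPU memory.")
```

`torch.cuda.memory_reserved()` staying nonzero after `del t` is the cache at work: the memory was freed from PyTorch's perspective but is still reserved from the driver's.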
Thank you for your response. Is there a similar tool for inspecting actual memory reads and writes when running only on an Intel CPU?