My code has a host memory leak, possibly caused by tensors that a custom autograd function keeps attached to the graph when they shouldn't be. Can I detect such a leak with PyTorch's profiler? Consider the example from the tutorial here: PyTorch Profiler (PyTorch Tutorials 1.12.1+cu102 documentation)
with profile(activities=[ProfilerActivity.CPU],
        profile_memory=True, record_shapes=True) as prof:
    model(inputs)
Would the code above print higher memory usage values across epochs if there's a leak?
I don’t think checking the profiler output would help in this case, as it would show the memory usage of each operation, which is unrelated to storing tensors attached to a computation graph.
This code snippet should illustrate it:
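import torch
import torchvision.models as models
from torch.profiler import profile, ProfilerActivity
model = models.resnet18()
x = torch.randn(1, 3, 224, 224)
outputs = []
for i in range(10):
    with profile(activities=[ProfilerActivity.CPU], profile_memory=True, record_shapes=True) as prof:
        out = model(x)
    outputs.append(out)  # storing the output keeps its computation graph alive
    print(prof.key_averages().table(sort_by="self_cpu_memory_usage", row_limit=10))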
Also note that you are not seeing a memory leak (which would indicate that memory is lost and cannot be freed anymore), but an expected increase of memory usage since you are explicitly storing tensors including their attached computation graph.
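To make the difference concrete, here is a minimal sketch of storing outputs with and without their graph. It uses a small `nn.Linear` as a stand-in for the model (my assumption, just to keep it quick to run), and the list names `attached` and `detached` are only illustrative:

```python
import torch
import torch.nn as nn

# Small stand-in model (an assumption; the snippet above uses resnet18).
model = nn.Linear(8, 4)
x = torch.randn(2, 8)

attached, detached = [], []
for _ in range(3):
    out = model(x)
    attached.append(out)           # keeps the autograd graph reachable
    detached.append(out.detach())  # stores only the values, no graph

# A non-None grad_fn means the tensor still references its graph.
has_graph_attached = [t.grad_fn is not None for t in attached]
has_graph_detached = [t.grad_fn is not None for t in detached]
print(has_graph_attached, has_graph_detached)
```

If you only need the values for logging or metrics, storing the detached tensor (or a Python number via `.item()` for scalars) should avoid the growth.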
The follow-up question is then: can I somehow get the total tensor count, or something similar, so that I can check whether it grows with each epoch?
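One approach that might work is to walk the objects tracked by Python's garbage collector and count the ones that are tensors. This is only a rough sketch: `count_live_tensors` is a hypothetical helper (not a PyTorch API), the `nn.Linear` model is a stand-in, and the count covers Python-visible tensor objects only, so tensors held purely on the C++ side may not show up.

```python
import gc
import torch

def count_live_tensors():
    """Count tensor objects currently tracked by Python's garbage collector."""
    count = 0
    for obj in gc.get_objects():
        try:
            if torch.is_tensor(obj):
                count += 1
        except Exception:
            # Some tracked objects raise when inspected; skip them.
            pass
    return count

# Stand-in model and input (assumptions, to keep the example small).
model = torch.nn.Linear(8, 4)
x = torch.randn(2, 8)

outputs = []
counts = []
for epoch in range(3):
    outputs.append(model(x))  # deliberately retained to simulate the growth
    gc.collect()              # drop unreachable objects before counting
    counts.append(count_live_tensors())
print(counts)
```

If the number keeps climbing from epoch to epoch in the real training loop, something is holding on to tensors; a flat count would suggest looking elsewhere.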