How to properly manage tensor memory when using the PyTorch C++ API?

Hello

I am experimenting with the PyTorch C++ frontend and noticed some unexpected memory behavior when working with tensors. :slightly_smiling_face: Specifically, when I create tensors inside a loop and pass them to functions, I sometimes see memory usage growing continuously, as if the tensors are not being released (see the sketch below for the kind of loop I mean).
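
Here is a minimal sketch of the pattern, just for illustration. The shapes and the `process()` helper are placeholders I made up; the real project passes tensors through much larger functions:

```cpp
#include <torch/torch.h>

// Hypothetical helper, standing in for the larger processing functions
// that the real project calls with freshly created tensors.
torch::Tensor process(const torch::Tensor& input) {
  return input * 2.0 + 1.0;
}

int main() {
  for (int i = 0; i < 100000; ++i) {
    // A fresh tensor is created on every iteration and handed off.
    torch::Tensor t = torch::randn({1024, 1024});
    torch::Tensor out = process(t);
    // Both t and out go out of scope here, so I would expect their
    // storage to be released, but resident memory keeps climbing
    // over many iterations in the larger project.
  }
  return 0;
}
```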

This does not happen in my Python experiments, so I suspect I am missing something about how the C++ API handles reference counting and tensor ownership.

I tried using at::Tensor::detach() and also checked whether torch::NoGradGuard could help, but the issue persists in long-running processes (a rough sketch of what I tried is below). It is not always easy to reproduce in small code samples, but in larger projects the memory growth becomes noticeable after many iterations. :thinking: I have already checked the C++ — PyTorch 2.8 documentation for reference.
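
Roughly what I tried, again as a made-up minimal sketch rather than the real code:

```cpp
#include <torch/torch.h>

int main() {
  for (int i = 0; i < 100000; ++i) {
    // What I tried: disable autograd tracking for the whole iteration...
    torch::NoGradGuard no_grad;
    torch::Tensor t = torch::randn({1024, 1024});
    // ...and detach the result so no graph history is kept alive.
    torch::Tensor out = (t * 2.0 + 1.0).detach();
    // Despite this, memory still grows over time in the larger project.
  }
  return 0;
}
```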

I want to understand whether this is a misuse of the API on my side, or whether there are specific cleanup steps required in C++ that are handled automatically in Python. :thinking:

Has anyone else run into this? If so, what is the recommended pattern for managing tensor lifetimes safely in C++?

Any best practices or examples would be very helpful, since good memory management is crucial for production C++ projects using PyTorch.

Thank you! :slightly_smiling_face: