The CPU memory usage just keeps increasing while my program runs. Sorry, the code is internal so I can't paste it. Here's some information:
- My program runs in inference mode, inside a `torch.no_grad()` context. So it can't be the computation graph "leaking" memory.
- The neural networks are small `nn.LSTM` and `nn.Linear` models. With a small batch size (like 4 or 8) the memory usage is stable, but with a larger batch size (32, 128, …) it keeps increasing over iterations.
- I use `psutil` to get the memory usage. I also tried `tracemalloc` and the torch profiler, but the latter two tools can't tell me where the leak lies.
- My program can run on either CPU or GPU. The possible memory leak only occurs when running on CPU.
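For context, this is roughly how I used `tracemalloc` (a minimal self-contained sketch; `my_function` and the deliberate `_leak` list are stand-ins, not my real code). My understanding is that `tracemalloc` only tracks allocations made through Python's memory allocator, so if the growth happens inside native libtorch/MKL buffers it would show nothing, which may be why it came up empty for me:

```python
import tracemalloc

_leak = []  # stand-in for whatever is accumulating in my real program

def my_function(batch):
    _leak.extend(batch)  # deliberately grows on every call

tracemalloc.start()
before = tracemalloc.take_snapshot()
for _ in range(100):
    my_function([0] * 1000)
after = tracemalloc.take_snapshot()

# Top Python-level allocation sites that grew between the snapshots.
top = after.compare_to(before, "lineno")
for stat in top[:3]:
    print(stat)
```

In this toy case the `_leak.extend` line shows up at the top of the diff, but in my real program the diff stays essentially flat even though RSS keeps climbing.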
The iteration loop is like:

```python
with torch.no_grad():
    for minibatch in testloader:
        my_function(minibatch)
        mt = psutil.Process(os.getpid()).memory_info().rss
        print(mt)
```
I know it's hard to find the cause without the detailed code. Any suggestion is welcome.