Heap size increases constantly when running inference in a new thread

The heap size increases constantly when I run the extract function in a new thread with the same instance of the torch jit script module. I noticed that the new heap memory is allocated after the forward call; if I comment out the forward call, no heap memory is allocated. I am using CPU only, with PyTorch version 1.2.0. No extra heap memory is allocated when running on a single thread.
Am I doing something wrong?

A new thread is created with the code below:


std::thread th1(extract);

} while (true);
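
Since the snippet above is cut off, here is a minimal sketch of two ways the loop could drive extract, which may help narrow down whether the growth is tied to thread creation itself. Everything in it is an assumption for illustration: extract here is a trivial stand-in (the real one would call forward on the shared module), and run_with_new_threads, run_with_one_thread and g_calls are hypothetical names. It uses only the standard library, not libtorch.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for the poster's extract(); the real function would
// call forward() on the shared script module. Here it only counts calls.
std::atomic<int> g_calls{0};
void extract() { ++g_calls; }

// Variant A: one fresh thread per iteration, as in the post.
void run_with_new_threads(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        std::thread th1(extract);
        // A joinable std::thread must be joined (or detached) before it is
        // destroyed, otherwise std::terminate is called.
        th1.join();
    }
}

// Variant B: a single long-lived thread that runs every iteration.
void run_with_one_thread(int iterations) {
    std::thread worker([iterations] {
        for (int i = 0; i < iterations; ++i) {
            extract();
        }
    });
    worker.join();
}
```

If the heap only grows with variant A and stays flat with variant B, that would point at something allocated per thread rather than per forward call. This is a diagnostic idea, not a confirmed explanation.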

(I have no experience loading models in C++ yet, so I could be way off, but I'm replying in case it takes a while for more knowledgeable people to respond.)
Have you tried ensuring your module is in eval mode? It looks like in C++ you can call module.eval(); just as in Python. The growth you describe would be consistent with forward creating a graph for backward that never gets destroyed.
It might also be that putting your module in eval mode in Python before exporting carries over to C++, but I'm not sure of that; I think I saw something to that effect, though.

Yes, module.eval() is called prior to forward. Besides that, torch::NoGradGuard guard; is used, and requires_grad(false) has also been set on the tensor.
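
One thing that may be worth checking here: as far as I know, grad mode in libtorch is thread-local state, so a torch::NoGradGuard created on the main thread would not apply to a forward call made on a newly started std::thread. The sketch below only mimics that behaviour with a plain thread_local flag and an RAII guard. It does not use libtorch, and grad_enabled, NoGradLike and flag_seen_by_new_thread are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <thread>

// Hypothetical stand-in for a thread-local "grad enabled" flag.
// Every thread gets its own copy, initialised to true.
thread_local bool grad_enabled = true;

// RAII guard that disables the flag for the current thread only and
// restores the previous value on destruction.
struct NoGradLike {
    bool prev;
    NoGradLike() : prev(grad_enabled) { grad_enabled = false; }
    ~NoGradLike() { grad_enabled = prev; }
};

// Starts a new thread and reports the flag value that thread observes.
// If guard_inside_thread is true, the guard is created in the new thread.
bool flag_seen_by_new_thread(bool guard_inside_thread) {
    bool seen = false;
    std::thread t([&seen, guard_inside_thread] {
        if (guard_inside_thread) {
            NoGradLike guard;
            seen = grad_enabled;
        } else {
            seen = grad_enabled;
        }
    });
    t.join();
    return seen;
}
```

With a guard held on the calling thread, flag_seen_by_new_thread(false) still reports the flag as enabled, while flag_seen_by_new_thread(true) reports it as disabled. In the real code, the equivalent would be to make torch::NoGradGuard guard; the first statement inside extract itself, assuming it is currently created outside the thread function.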

I found a similar problem, but it seems there is no answer yet: https://github.com/pytorch/pytorch/issues/24237