Currently, I am using a CPU-only build of PyTorch. When I run inference, memory usage keeps increasing with every new unique file used for inference, as if information about each input file were being cached somewhere. On the other hand, memory usage does not increase if I use the same file again and again.
Is there a way to clear this cache on CPU, similar to torch.cuda.empty_cache() on GPUs?
I am serving the model as a Flask app, and the inputs are images. I am loading the data with a torch DataLoader. How do I free that DataLoader? I think it is done automatically.