How to unload a model in C++?

Hi,
I need to unload a model that was loaded by torch::jit::load() in C++, but I couldn’t find anything for this, such as a ‘torch::jit::unload()’.

  1. In my C++ code there are two functions, and the API is exposed using pybind11:
#include <iostream>
#include <vector>
#include <string>

#include <torch/torch.h>
#include <torch/script.h>

#include <opencv2/core.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc.hpp>
...
torch::jit::script::Module initialize(const std::string &fpath) {
    // Load the TorchScript model directly onto the GPU
    torch::jit::script::Module model = torch::jit::load(fpath, torch::kCUDA);
    // Move it to the first device (redundant with the load above, but harmless)
    model.to(torch::Device("cuda:0"));
    return model;
}


std::tuple<at::Tensor, at::Tensor> do_inference(torch::jit::script::Module &model, ...) {
    ...  // pre-processing
    at::Tensor out = model.forward(input).toTensor();
    ...  // post-processing
}

PYBIND11_MODULE(...) {
    m.def("initialize", &initialize, ...);
    ...
    m.def("do_inference", &do_inference, ...);
}
  2. Build it into foo.so and import the module in Python 3.6. The Python process is a web service, like:
import foo
from flask import Flask  # a web service
...

model = foo.initialize("./resnet50.pt")

img = open("./kitty.jpg", "rb")
foo.do_inference(model, img)

# This service is running
  3. The web service works fine, but how do I unload the model from the device (GPU or CPU) when I want to update the model without stopping the service?
    Is there a solution for that out there?
    Thanks!

Could you just del model and clear the cache?
I don’t see that any modules, references, etc. are stored in the C++ code, so I assume the objects can be freed in your Python script.
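
Something like this, i.e. a minimal sketch of that suggestion, assuming the foo module and the resnet50.pt path from the post above:

import torch
import foo

model = foo.initialize("./resnet50.pt")
# ... serve some requests ...
del model                 # drop the Python reference returned by pybind11
torch.cuda.empty_cache()  # return unused cached blocks to the driver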

Yes, I want to del model and clear the cache, but NOT shut down the Python process.
In fact, the C++ code is used for inference (including pre-processing and post-processing). The Python process (which runs as a service) loads the model and runs inference by calling the C++ dynamic library (there may be several of them: one dynamic library per model).
So, is there a way to delete a loaded model without stopping the Python process?
Thanks!

Yes, you can delete the model by running del model in the Python script without shutting it down.

I’m trying del model in the Python process, but it doesn’t seem to work.
When I check with nvidia-smi, the process is still there and the GPU memory is not released.
It seems that the model is loaded in the C++ context and must be released in C++, right?

The memory might still be in the cache and thus not released. You could run torch.cuda.empty_cache() to get it back. I’m not 100% sure, but I don’t think you need to release it in the C++ backend, since you are holding a reference to it in Python.
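
Putting the two steps together, a hedged sketch of swapping in an updated model without restarting the service (./new_resnet50.pt is a hypothetical path for the updated checkpoint):

import torch
import foo

model = foo.initialize("./resnet50.pt")
# ... later, when the model needs to be updated ...
del model                 # free the old module's tensors
torch.cuda.empty_cache()  # hand the cached GPU memory back to the driver
model = foo.initialize("./new_resnet50.pt")  # hypothetical updated checkpoint

Note that nvidia-smi will still show the process and some memory for the CUDA context itself; empty_cache() only releases the caching allocator’s unused blocks, not the context.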

I used torch.cuda.memory_summary("cuda") before and after del model, and the “Allocated memory” was released after del model. It seems to work. Thanks!
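
For completeness, a small sketch of that check, again assuming the foo module from above:

import torch
import foo

model = foo.initialize("./resnet50.pt")
print(torch.cuda.memory_allocated())  # bytes currently held by live tensors
del model
torch.cuda.empty_cache()
print(torch.cuda.memory_allocated())  # should drop once the module is freed
print(torch.cuda.memory_summary())    # detailed report from the caching allocator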