LibTorch uses much more GPU memory than Python?

(Riddick Gao) #1

Recently, I have been using LibTorch for inference. I saved the original model to a .pt file using the torch.jit.trace() API:

    traced_script_module = torch.jit.trace(base_model, example)
    traced_script_module.save(output_path)

Then I loaded it in C++ using the torch::jit::load() API and called eval() on the module:

    module = torch::jit::load(model_path);
    module->eval();
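
For context, the forward pass I am measuring follows the standard LibTorch pattern, roughly like this (the input shape and the CUDA placement here are just illustrative, not my exact code):

    // Move the traced module to the GPU and build the input.
    module->to(at::kCUDA);
    std::vector<torch::jit::IValue> inputs;
    inputs.push_back(torch::ones({1, 3, 224, 224}).to(at::kCUDA));

    // Run the forward pass and read the result back as a tensor.
    at::Tensor output = module->forward(inputs).toTensor();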

But I found that LibTorch occupies much more GPU memory when running forward() on the same image size than the original module in Python.

So I reloaded the .pt model in Python and found that it has the same memory footprint as the original module.

So it seems that C++ uses much more memory than Python with the same model and the same image size.
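
One thing I am unsure about is whether autograd is tracking the forward pass on the C++ side. If that matters, I believe the C++ counterpart of Python's with torch.no_grad() is torch::NoGradGuard, roughly like this (just a sketch, I have not verified whether it changes the memory usage):

    {
        // Disable gradient tracking for this scope, similar to
        // "with torch.no_grad():" in Python.
        torch::NoGradGuard no_grad;
        at::Tensor output = module->forward(inputs).toTensor();
    }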

How can I deal with this? Has anyone else run into the same problem? Thanks.

PyTorch version: 1.1.0
Python version: 3.7.0
OS: Windows 10
IDE: VS2015
CUDA 10.0 + cuDNN v7.5.1.10