Reducing PyTorch C++ API Library Size for Inference-Only Use

I’ve previously trained a model using the Python API and successfully exported it using torch.jit.trace. Now, I’m deploying this model in C++ and only need the inference functionality.

Here’s a simplified version of the code I’m using:

// _model is a torch::jit::script::Module loaded earlier via torch::jit::load()
torch::Tensor input_tensor = preprocess(input.image());
std::vector<torch::jit::IValue> inputs{input_tensor};
_model.eval();  // inference only, no training
torch::Tensor output_tensor = _model.forward(inputs).toTensor().to(torch::kCPU);
int classification_score = output_tensor[0].item<int>();
cv::Mat segmentation = interpretSegmentation(output_tensor, input.image());

In my current CMakeLists.txt, I’m including all the Torch libraries, which results in a build size of nearly 2.5 GB. I’m looking for a way to reduce this, similar to how OpenCV allows selective inclusion of modules, e.g.:

 find_package(OpenCV REQUIRED core imgproc)
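
For comparison, my Torch setup is essentially the stock LibTorch CMake configuration, simplified below (my_app and main.cpp are placeholder names). Ideally I could link against only the pieces needed for CPU inference instead of everything that ${TORCH_LIBRARIES} pulls in, but I haven't found a supported way to do that:

 find_package(Torch REQUIRED)                        # pulls in the full LibTorch distribution
 add_executable(my_app main.cpp)
 target_link_libraries(my_app "${TORCH_LIBRARIES}")  # links the full set of Torch libraries
 set_property(TARGET my_app PROPERTY CXX_STANDARD 17)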

Is there a way to include only the minimal components of LibTorch needed for inference?

Thanks in advance!


Hi @fgauthier, thanks for bringing up this question. Could you elaborate a bit more on why binary size is a problem in your inference use case?