When I invoke my PyTorch inference code from C++ via the Python bindings, the code hangs indefinitely in torch.load(model, map_location="cuda:0") for GPU models. For CPU models there is no issue.
Although the GPU model hangs when invoked via the Python bindings, the same inference code runs successfully for both GPU and CPU models when invoked directly through the Python interpreter.
What kind of Python binding are you using?
Could you post the C++ code that creates this issue and, if possible, explain how you've exported the model in Python?
Following is the sample C++ code that loads the PyTorch script "pytorch_inference.py" and calls the inference function inference_code(data_from_cpp):
#include <Python.h>
#include <numpy/arrayobject.h>
#include <iostream>
#include <cassert>
//============PyTorch file loading=============================
Py_Initialize();
if (_import_array() < 0) { // initialize the NumPy C API before using PyArray_* functions
    PyErr_Print();
    assert(0);
}
PyRun_SimpleString("import sys\nsys.path.append('./pytorch_dir')"); // path to the dir where my PyTorch code pytorch_inference.py resides
PyObject *pName_pytorch_file = PyUnicode_FromString("pytorch_inference"); // module name of pytorch_inference.py
PyObject *pModule_pytorch_module = PyImport_Import(pName_pytorch_file);
Py_DECREF(pName_pytorch_file);
PyObject *pFunc_pytorch_inference_pointer = NULL;
if (pModule_pytorch_module != NULL) {
    pFunc_pytorch_inference_pointer = PyObject_GetAttrString(pModule_pytorch_module, "inference_code"); // get a handle to the Python function to be called from C++
    std::cout << "[Python Func]: function handle obtained" << std::endl;
}
else
{
    PyErr_Print();
    std::cout << "Module failed to load" << std::endl;
    assert(0);
}
//================call to a specific Python function from C++====
if (pFunc_pytorch_inference_pointer && PyCallable_Check(pFunc_pytorch_inference_pointer)) {
    PyObject *pArgs_to_pytorch_func = PyTuple_New(1);
    PyObject *data = PyArray_SimpleNewFromData(1, dims, NPY_USHORT, bufferOfData); // dims and bufferOfData are defined elsewhere
    PyTuple_SetItem(pArgs_to_pytorch_func, 0, data); // the tuple steals the reference to data
    PyObject *Dict = PyObject_CallObject(pFunc_pytorch_inference_pointer, pArgs_to_pytorch_func); // calls the PyTorch function "inference_code" with the data
}
//====================================================================
And the PyTorch code pytorch_inference.py:
#====================Python side: pytorch_inference.py ==========
print("import -- entered")
import os
import pickle as pkl
import sys
import warnings
import shutil
import cv2
from PIL import Image
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.nn.init
import torch.optim as optim
import torch.optim.lr_scheduler
import torch.utils.data as data
from torchvision.utils import save_image
import torchvision
print("import succeeded")

model_path = "./model_gpu.pth"

def inference_code(data_from_cpp):
    dict_to_cpp = {}
    torch.cuda.init()  # to test
    print("torch.cuda.is_available(): ", torch.cuda.is_available())  # to test: returned True
    print("torch.cuda.is_initialized(): ", torch.cuda.is_initialized())  # to test: returned True
    model = torch.load(model_path, map_location="cuda:0")  # ---- hangs right here, but works for CPU models (for CPU I do not pass map_location="cuda:0")
    return dict_to_cpp
#===================================================================
How you’ve exported the model in Python?
I give the path to the model in the PyTorch code itself:
model_path = "./model_gpu.pth"  # tried giving an absolute path, but it still hangs in torch.load
torch.load(model_path, map_location="cuda:0")
@ptrblck, it hangs only when I call it from C++ and only for GPU models; CPU models run fine even with the Python bindings. If I call it from Python, it runs fine for both CPU and GPU models.
We have some post-processing steps that are still evolving and need to stay on the Python side for rapid design iteration, and there are other parts on the C++ side that need not be reimplemented in Python, as they belong in C++. So I just need the predictions from the Python file in the form of a dictionary, which I can then use in C++ as input to another module.
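As a rough sketch of what I mean on the C++ side (assuming the dictionary holds, say, a float under a hypothetical key "score"; Dict is the PyObject* returned by PyObject_CallObject in the snippet above):

if (Dict && PyDict_Check(Dict)) {
    PyObject *value = PyDict_GetItemString(Dict, "score"); // borrowed reference, no DECREF needed
    if (value != NULL) {
        double score = PyFloat_AsDouble(value);
        std::cout << "score from Python: " << score << std::endl;
    }
    Py_DECREF(Dict); // release the dictionary returned by the call
}
else {
    PyErr_Print(); // the call failed or did not return a dict
}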
Any clue on this? Lua-based Torch didn't have any problem with GPU models when invoked from the Lua bindings, but PyTorch gets stuck in torch.load for GPU models when invoked via the Python bindings.
Unfortunately, I have never called a Python script from C++, so I cannot be of much help here.
A hang often comes from multiprocessing issues or mismatches, so if possible I would try to scale the problem down and use as few threads/processes as possible.
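A minimal sketch of how that could look on the embedding side, assuming OpenMP/MKL are involved (OMP_NUM_THREADS/MKL_NUM_THREADS must be set before Python and torch are loaded; torch.set_num_threads limits PyTorch's intra-op threads):

#include <cstdlib>
#include <Python.h>

setenv("OMP_NUM_THREADS", "1", 1); // limit OpenMP threads before the interpreter starts
setenv("MKL_NUM_THREADS", "1", 1); // limit MKL threads as well
Py_Initialize();
PyRun_SimpleString("import torch\n"
                   "torch.set_num_threads(1)  # limit PyTorch intra-op threads\n");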
I kind of figured out the problem: whenever I call the PyTorch inference code from the CPU main thread it works, but if I call the same from a CPU worker thread it hangs in torch.load for all GPU models.
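For reference, a minimal sketch of the GIL handling that is normally required when the Python call is made from a C++ worker thread rather than the thread that ran Py_Initialize(). This is just the standard CPython embedding pattern (PyEval_SaveThread / PyGILState_Ensure), not a confirmed fix for the CUDA hang; pFunc_pytorch_inference_pointer and pArgs_to_pytorch_func are the objects from the snippet above:

#include <Python.h>
#include <thread>

Py_Initialize();
// Release the GIL on the main thread so worker threads can acquire it later.
// (On Python < 3.7, PyEval_InitThreads() would also be called here.)
PyThreadState *mainState = PyEval_SaveThread();

std::thread worker([&]() {
    // Every Python C API call from a non-main thread must be bracketed
    // by PyGILState_Ensure / PyGILState_Release.
    PyGILState_STATE gstate = PyGILState_Ensure();
    PyObject *Dict = PyObject_CallObject(pFunc_pytorch_inference_pointer,
                                         pArgs_to_pytorch_func);
    Py_XDECREF(Dict);
    PyGILState_Release(gstate);
});
worker.join();

// Restore the main thread state before shutting the interpreter down.
PyEval_RestoreThread(mainState);
Py_Finalize();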