torch.load(model) hangs indefinitely when invoked via Python bindings

Hello,

When trying to invoke the PyTorch inference code from C++ using the Python bindings, the code gets hung indefinitely in torch.load(model, map_location="cuda:0") for GPU models. However, for CPU models I have no issue.

Although the GPU model hangs if invoked via the Python bindings, the same inference code runs successfully for both GPU and CPU models when invoked via the Python interpreter.

Following are the details of my setup:
Installation command: conda install pytorch torchvision cudatoolkit=10.2 -c pytorch
nvcc version:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89

nvidia-smi:

Any clue?

What kind of Python binding are you using?
Could you post the C++ code which creates this issue and, if possible, how you've exported the model in Python?

@ptrblck, thanks for the reply.

Following is the sample code that loads the PyTorch script "pytorch_inference.py" and calls its inference function inference_code(data_from_cpp) from C++:

#include <Python.h>
#include <numpy/arrayobject.h>  // needed for PyArray_SimpleNewFromData / NPY_USHORT

//============ PyTorch file loading =============================

    Py_Initialize();
    import_array();  // initialize the NumPy C API (required before any PyArray_* call)
    PyRun_SimpleString("import sys\nsys.path.append('./pytorch_dir')"); // path to the dir where my pytorch code pytorch_inference.py resides
    PyObject *pName_pytorch_file = PyUnicode_FromString("pytorch_inference"); // module name of pytorch_inference.py
    PyObject *pModule_pytorch_module = PyImport_Import(pName_pytorch_file);
    PyObject *pFunc_pytorch_inference_pointer = NULL;
    if (pModule_pytorch_module != NULL) {
        pFunc_pytorch_inference_pointer = PyObject_GetAttrString(pModule_pytorch_module, "inference_code"); // get the python function to be called from C++
        std::cout << "[Python Func]: offset set" << std::endl;
    }
    else
    {
        PyErr_Print(); // print the Python exception that made the import fail
        std::cout << "Module failed to load " << std::endl;
        assert(0);
    }

//================ call to a specific python function from C++ ====

    if (pFunc_pytorch_inference_pointer && PyCallable_Check(pFunc_pytorch_inference_pointer)) {
        PyObject *pArgs_to_pytorch_func = PyTuple_New(1);
        PyObject *data = PyArray_SimpleNewFromData(1, dims, NPY_USHORT, bufferOfData); // dims and bufferOfData are set up elsewhere
        PyTuple_SetItem(pArgs_to_pytorch_func, 0, data); // the tuple steals the reference to data
        PyObject *Dict = PyObject_CallObject(pFunc_pytorch_inference_pointer, pArgs_to_pytorch_func); // calls the pytorch function "inference_code" with the argument tuple
    }


//====================================================================

And the pytorch code pytorch_inference.py:

#==================== Python side: pytorch_inference.py ==========

print("import -- entered")
import os
import pickle as pkl
import sys
import warnings
import shutil
import cv2
from PIL import Image
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.nn.init
import torch.optim as optim
import torch.optim.lr_scheduler
import torch.utils.data as data
from torchvision.utils import save_image
import torchvision
print("import succeeded")
model_path = "./model_gpu.pth"

def inference_code(data_from_cpp):
    dict_to_cpp = {}
    torch.cuda.init()                                                    # to test
    print("torch.cuda.is_available(): ", torch.cuda.is_available())     # to test: returned True
    print("torch.cuda.is_initialized(): ", torch.cuda.is_initialized()) # to test: returned True
    model = torch.load(model_path, map_location="cuda:0")  # ---- hangs right here, but works for CPU models (for CPU I do not pass map_location="cuda:0")
    return dict_to_cpp

#===================================================================

How you’ve exported the model in Python?

I give the path to the model in the pytorch code itself:

model_path = "./model_gpu.pth"  #tried giving absolute path but still it hangs in torch.load
torch.load(model_path, map_location="cuda:0")

Is your script also hanging if you use Python directly, or only if you call it from C++ using the bindings?

Just out of curiosity, what’s your use case that you want to call Python from C++ instead of directly loading the model in C++?

@ptrblck, It hangs only when I call it from C++ and only for GPU models; CPU models run fine even with the Python bindings. If I call it from Python, it runs fine for both CPU and GPU models.

We have a few post-processing steps that are still evolving and need to stay on the Python side for faster design iteration, and there are a few other parts on the C++ side that need not be implemented in Python, as they belong in C++. So I just need the predictions from the Python file in the form of a dictionary, which I can then use in C++ as input to the other modules.
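On the C++ side I would then read values back out of the returned dictionary roughly like this (just a sketch; the "score" key and the float type are hypothetical placeholders for whatever the real dictionary holds):

// Sketch: reading a value out of the dict returned by inference_code().
// The key name "score" is a hypothetical placeholder.
PyObject *Dict = PyObject_CallObject(pFunc_pytorch_inference_pointer, pArgs_to_pytorch_func);
if (Dict != NULL && PyDict_Check(Dict)) {
    PyObject *item = PyDict_GetItemString(Dict, "score"); // borrowed reference, do not decref
    if (item != NULL && PyFloat_Check(item)) {
        double score = PyFloat_AsDouble(item);
        std::cout << "score from Python: " << score << std::endl;
    }
    Py_DECREF(Dict); // PyObject_CallObject returns a new reference
}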

@ptrblck,

Any clue on this? Lua-based Torch didn't have any problem with GPU models when invoked from the Lua bindings, but PyTorch gets stuck in torch.load for GPU models when invoked via the Python bindings.

Unfortunately, I've never called a Python script from C++, so I cannot be of much help here. :confused:
A hang often comes from some multiprocessing issues / mismatches, so if possible, I would try to scale down the problem and use as few threads/processes as possible.
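For example, a stripped-down repro along these lines (just a sketch; the checkpoint path is a placeholder) would take NumPy and your custom module out of the picture and only exercise the torch import and the load:

#include <Python.h>

int main() {
    Py_Initialize();
    // Single thread, no NumPy, no custom module: only import torch and load the checkpoint.
    // "./model_gpu.pth" is a placeholder for the actual checkpoint path.
    PyRun_SimpleString(
        "import torch\n"
        "print('loading...')\n"
        "torch.load('./model_gpu.pth', map_location='cuda:0')\n"
        "print('loaded')\n");
    Py_Finalize();
    return 0;
}

If even this minimal version hangs, the problem is in the embedding itself rather than in your pipeline.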

@ptrblck,

I kind of figured out the problem. Whenever I load the PyTorch inference code from the main CPU thread it works, but if I call the same code from a CPU worker thread, it hangs in torch.load for all GPU models.
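For reference, a worker thread that calls into the embedded interpreter has to acquire the GIL first; a rough sketch of that pattern (assuming the main thread released the GIL after Py_Initialize(), e.g. via PyEval_SaveThread(), and that the function and argument objects are the ones set up earlier):

// Sketch only: calling the Python inference function from a C++ worker thread.
// Assumes the main thread called Py_Initialize() and then released the GIL,
// e.g. with: PyThreadState *mainState = PyEval_SaveThread();
void worker_thread_inference(PyObject *pFunc_pytorch_inference_pointer,
                             PyObject *pArgs_to_pytorch_func)
{
    PyGILState_STATE gstate = PyGILState_Ensure(); // acquire the GIL for this thread

    PyObject *Dict = PyObject_CallObject(pFunc_pytorch_inference_pointer,
                                         pArgs_to_pytorch_func);
    if (Dict == NULL) {
        PyErr_Print();    // show the Python traceback, if any
    } else {
        Py_DECREF(Dict);
    }

    PyGILState_Release(gstate); // release the GIL again
}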

Do you know what is going on?

@Alex_john any solution??
Same problem here

@RebirthT @Alex_john Did anyone manage to find a solution to this? I have exactly the same issue.