In my experimental setup, I am doing some data pre-processing in C++. I have tried two approaches:
- Using `subprocess` to run the compiled C++ executable, which saves its results to disk; the Python side then reads the processed result back.
- Using pybind11 to wrap the C++ code and return a buffer directly to Python, which can then be converted to a NumPy array.
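The first approach looks roughly like this. This is a minimal sketch of my setup, not the exact code: the executable name, output path, and the raw-float32 file format are placeholders for my actual pipeline.

```python
import subprocess

import numpy as np


def preprocess_via_subprocess(cmd, out_path, shape, dtype=np.float32):
    """Run the C++ executable, then read its binary output back into NumPy.

    `cmd` is the argv list for the executable (placeholder for the real
    C++ binary); it is expected to write raw `dtype` values to `out_path`.
    """
    # Block until the C++ process finishes; raise if it exits non-zero.
    subprocess.run(cmd, check=True)
    # Read the raw binary result from disk and restore its shape.
    return np.fromfile(out_path, dtype=dtype).reshape(shape)
```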
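On the Python side, the second approach converts the returned buffer into an array along these lines. This is a sketch of the pattern, not my actual binding code; one detail I am unsure about is ownership, so the sketch takes an explicit copy so the array owns its memory rather than aliasing memory the C++ side might free.

```python
import numpy as np


def buffer_to_array(buf, shape, dtype=np.float32):
    """Convert a buffer returned by a pybind11 extension into a NumPy array.

    `buf` is any object supporting the buffer protocol (bytes here, a
    pybind11-exported buffer in the real code).
    """
    # Zero-copy view over the C++-owned memory.
    view = np.frombuffer(buf, dtype=dtype).reshape(shape)
    # Explicit copy: the result owns its data, so it stays valid even if
    # the C++ side later frees or reuses the underlying buffer.
    return view.copy()
```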
Both methods have been tested individually and produce correct, consistent results. However, when integrated into a dataloader with more than one worker, the second method runs into trouble: training starts producing NaN values very quickly. I do not know how to investigate this. Are there any suggestions or tips for using pybind11-wrapped C++ code with a PyTorch dataloader?
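For reference, the dataloader integration is shaped roughly like the sketch below. The class and the `cpp_preprocess` callable are stand-ins for my real dataset and the pybind11-wrapped function; the copy and the finiteness check are things I am experimenting with to localize where the NaNs first appear.

```python
import numpy as np


class PreprocessedDataset:
    """Map-style dataset usable with torch.utils.data.DataLoader.

    `cpp_preprocess` stands in for the pybind11-wrapped C++ function and
    takes a sample index, returning something array-like.
    """

    def __init__(self, cpp_preprocess, n):
        self.cpp_preprocess = cpp_preprocess
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, idx):
        arr = self.cpp_preprocess(idx)
        # Copy so the sample owns its memory before it is pickled and
        # shipped from the worker process back to the main process.
        arr = np.array(arr, dtype=np.float32, copy=True)
        # Fail fast inside the worker: report which index produced
        # non-finite values instead of letting NaNs surface in the loss.
        if not np.isfinite(arr).all():
            raise ValueError(f"non-finite values in sample {idx}")
        return arr
```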