I need to use a PyTorch model from an existing C++/OpenCV-based application. All images are processed with OpenCV's CUDA modules. Currently, I have to copy all the data back to the CPU and use boost::python converters to build a NumPy array from it, which I then use to construct a PyTorch Tensor object. It works, but the round trip through host memory introduces a major slowdown (of course).
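For reference, the Python side of the current pipeline looks roughly like the sketch below. It is only an illustration: the function name `host_array_to_tensor`, the frame size, and the HWC-to-CHW float conversion are assumptions made for the example, and the zero-filled array stands in for what the boost::python converter hands over.

```python
import numpy as np
import torch


def host_array_to_tensor(image: np.ndarray) -> torch.Tensor:
    """Wrap a host-side NumPy image as a tensor and move it to the GPU if one is present."""
    # torch.from_numpy shares memory with the NumPy array, so no extra host copy here.
    tensor = torch.from_numpy(image)
    # HWC uint8 -> CHW float32 in [0, 1]; this layout is an assumption about the model input.
    tensor = tensor.permute(2, 0, 1).float().div(255.0)
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    # Host -> device upload; the data was already downloaded device -> host on the C++ side.
    return tensor.to(device)


# Hypothetical stand-in for the array produced by the boost::python converter.
frame = np.zeros((480, 640, 3), dtype=np.uint8)
batch = host_array_to_tensor(frame).unsqueeze(0)  # add a batch dimension
```

The download on the C++ side plus the `.to(device)` upload here are the two transfers I would like to get rid of.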
Is there a way to prepare the data for the PyTorch backend so that it stays on the GPU the whole time?
I know that PyTorch shares its backend with the original Torch (to some extent), so is this project of any use?
Looking forward to any clues.