Custom dataset with C++ backend

Hi all,

How would one go about extending the basic dataset and writing a backend for it in C++?

I have read the Contribution Guide. It suggests that new ops should be added to the ATen native package:

Modern implementations of operators. If you want to write a new operator, here is where it should go. Most CPU operators go in the top level directory, except for operators which need to be compiled specially; see cpu below.

However, I don’t want my backend to be a method on a Tensor or a Variable. I would like to be able to create more than just a function. I would like to connect to a database using a C++ API and use that connection to create a Python Dataset object.

For that I would need more than just a function. I would like to be able to create a C++ object and call it from Python code. Any ideas how I could do that?

I would also like it to be a part of the PyTorch project and not a custom extension like the one in this tutorial. Any suggestions on where the best place to put it would be?

I think you want to add a Python module written in C++. See “Extending Python with C or C++” in the Python documentation.
Or pybind11 would be a great package to use.

Then you can import the custom module and use it within a Dataset.
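
A minimal sketch of that last step, under some loud assumptions: `FakeDbBackend` is a pure-Python stand-in for whatever class your C++ module would expose, and its `num_rows` and `fetch` methods are hypothetical names, not part of any real library. The wrapper only implements the map-style protocol (`__len__` and `__getitem__`); in real code you would probably subclass `torch.utils.data.Dataset`, which is left out here so the snippet runs without PyTorch installed.

```python
class FakeDbBackend:
    """Stand-in for a hypothetical C++ extension class (e.g. bound with pybind11).

    The real object would hold a database connection; here the rows are
    simply kept in a Python list so the example is self-contained.
    """

    def __init__(self, rows):
        self._rows = list(rows)

    def num_rows(self):  # hypothetical method name
        return len(self._rows)

    def fetch(self, index):  # hypothetical method name
        return self._rows[index]


class DbDataset:
    """Map-style dataset that delegates all storage access to the backend.

    Assumption: in a PyTorch project this would subclass
    torch.utils.data.Dataset; the two methods below are the same either way.
    """

    def __init__(self, backend):
        self.backend = backend

    def __len__(self):
        return self.backend.num_rows()

    def __getitem__(self, index):
        # Reject out-of-range indices explicitly so plain iteration terminates.
        if not 0 <= index < len(self):
            raise IndexError(index)
        return self.backend.fetch(index)


# Three (feature, label) rows standing in for database records.
dataset = DbDataset(FakeDbBackend([(0.0, 1), (0.5, 0), (1.0, 1)]))
first_row = dataset[0]
```

Keeping the Dataset itself in Python and pushing only the database access into the compiled object means the C++ side needs no knowledge of PyTorch at all; swapping the stand-in for the real module should only change the constructor call.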

Yeah, thanks for the suggestion. Although, as I mentioned, I would like it to be a part of the PyTorch package and fit within the structure there.