I want to pass two tensors into my self-defined CUDA extension and get them back after some operations. However, since a C++ function can only return a single value, this seems hard to implement. I tried using a boost tuple to combine them, but got an error.
I also tried using data_ptr() to get the address of the tensor as an int value, but I don't know how to cast that int back into a Tensor pointer in my CUDA program. Do you know how to deal with that?
As @albanD already pointed out, you can use std::vector&lt;torch::Tensor&gt; to wrap multiple tensors in a vector. pybind11 will convert it to a list of torch tensors automatically.
This Dummy function simply returns N boring (all-zeros) torch tensors:
#include <torch/extension.h>

std::vector<torch::Tensor> Dummy(int const N) {
  std::vector<torch::Tensor> outputs;
  auto out = torch::zeros({1, 2}, torch::dtype(torch::kFloat32));
  for (int n = 0; n < N; n++)
    outputs.push_back(out.clone());
  return outputs;
}

PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
  m.def("Dummy", &Dummy, "Dummy function.");
}