How to get multiple Tensors returned from CUDA extension

Hi there,

I want to pass two tensors to my self-defined CUDA extension and get them back after some operations. However, since a C++ function can only return a single value, this seems to be a bit hard to implement. I tried to use a boost tuple to combine them, but got the error shown below:

undefined symbol: _ZN5boost6python23throw_error_already_setEv

I also tried to use data_ptr() to get the address of the tensor as an int value; however, I don't know how to cast that int value back into a Tensor pointer in my CUDA program. Do you know how to deal with that?

Hi,

Internally, variable_list (defined as using variable_list = std::vector<Variable>;) is used.
I am not sure if pybind will convert that properly though :confused:

As @albanD already pointed out, you can use std::vector<torch::Tensor> to wrap multiple tensors in a vector. Pybind11 will convert that properly to a list of torch tensors.

This Dummy function just outputs N boring (all-zeros) torch tensors.

#include <torch/extension.h>

// Returns N boring (all-zeros) 1x2 float32 tensors.
std::vector<torch::Tensor> Dummy(int const N) {
    std::vector<torch::Tensor> outputs;
    auto out = torch::zeros({1, 2}, torch::dtype(torch::kFloat32));
    for (int n = 0; n < N; n++)
        outputs.push_back(out.clone());
    return outputs;
}

PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
  m.def("Dummy", &Dummy, "Dummy function.");
}
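
To build it, a minimal setup.py along these lines should do the job (dummyextension and dummy_extension.cpp are just placeholder names here; use CUDAExtension instead of CppExtension once you have .cu files):

# setup.py -- build script sketch; module and file names are placeholders
from setuptools import setup
from torch.utils.cpp_extension import CppExtension, BuildExtension

setup(
    name="dummyextension",
    ext_modules=[CppExtension("dummyextension", ["dummy_extension.cpp"])],
    cmdclass={"build_ext": BuildExtension},
)

Then python setup.py install builds and installs the module (alternatively, torch.utils.cpp_extension.load can JIT-compile it).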

After building, we can use it in Python as

import torch
import dummyextension

outputs = dummyextension.Dummy(3)
print(outputs)
>> [tensor([[0., 0.]]), tensor([[0., 0.]]), tensor([[0., 0.]])]

Of course, Python's list unpacking will work as well:

out1, out2 = dummyextension.Dummy(2)
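
If you want exactly two tensors back, as in the original question, returning a std::tuple should also work, since torch/extension.h pulls in pybind11's STL casters. A rough sketch (the Scale function and its scaling logic are just made up for illustration):

#include <torch/extension.h>
#include <tuple>

// Hypothetical example: takes two tensors and returns both after some work.
std::tuple<torch::Tensor, torch::Tensor> Scale(torch::Tensor a, torch::Tensor b) {
    return std::make_tuple(a * 2, b * 3);
}

PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
  m.def("Scale", &Scale, "Returns two tensors as a Python tuple.");
}

On the Python side this comes back as a regular tuple, so out1, out2 = Scale(a, b) unpacks the same way.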