How to convert a vector<int> into a C++ torch tensor?

Hi,
I have a vector of ints, and all I want is to convert it into a torch tensor.
I tried this:

vector<int> x = {10, 15, 20, 100, 500};
vector<torch::jit::IValue> inputs;
inputs.push_back(torch::from_blob(x.data(), {1, (int64_t)x.size()}, torch::kInt64));

But that returns different numbers (see below).

6.4425e+10  4.2950e+11  1.3972e+14  2.5700e+02  9.4817e+13
[ Variable[CPULongType]{1,5} ]

What I can do instead is this:

vector<int> x = {10, 15, 20, 100, 500};
vector<torch::jit::IValue> inputs;
torch::Tensor t = torch::zeros({1, (int)x.size()}, torch::kInt64);
int counter = 0;
for (int i : x)
{
    t[0][counter] = i;
    counter++;
}
inputs.push_back(t);

But isn't there another way to convert a vector into a torch::Tensor?

I had some issues creating tensors with a long type directly, but I'm sure that's more because I don't know what I'm doing in C++ than because it doesn't work. However, I did something like the following recently for this.

vector<int> v({1, 2, 3});
auto opts = torch::TensorOptions().dtype(torch::kInt32);
torch::Tensor t = torch::from_blob(v.data(), {3}, opts).to(torch::kInt64);

Thanks for the reply. It worked :) I think my problem was setting the type directly to torch::kInt64 instead of setting it to torch::kInt32 and then converting it to torch::kInt64.

No worries, a mini-update to the answer: if you make the vector<int> a vector<long>, you can convert directly to a long tensor, as in the sketch below.
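For example, a minimal sketch (using int64_t rather than long, since long is only 32 bits on some platforms, while torch::kInt64 always means 64 bits):

std::vector<int64_t> x = {10, 15, 20, 100, 500};
auto opts = torch::TensorOptions().dtype(torch::kInt64);
// from_blob neither copies nor takes ownership, so x must outlive t
torch::Tensor t = torch::from_blob(x.data(), {1, (int64_t)x.size()}, opts);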

I ran into a similar issue and figured out that the problem is due to torch::from_blob not taking ownership of the vector from which the tensor is created. The solution is to call clone() on the tensor returned by torch::from_blob.

For example, in the OP's question, if inputs is created from a vector vec in one scope but used after vec has gone out of scope, then inputs is likely to contain garbage values.

    torch::Tensor inputs, cloned_inputs;
    {
        std::vector<long> vec = {10, 15, 20, 100, 500};
        auto options = torch::TensorOptions().dtype(at::kLong);
        inputs = torch::from_blob(vec.data(), {1, (long)vec.size()}, options);
        cloned_inputs = torch::from_blob(vec.data(), {1, (long)vec.size()}, options).clone();
        std::cout << "inputs within scope of vec: \n" << inputs << std::endl;
        std::cout << "cloned_inputs within scope of vec: \n" << cloned_inputs << std::endl;
    }
    std::cout << "inputs beyond scope of vec: \n" << inputs << std::endl;
    std::cout << "cloned_inputs beyond scope of vec: \n" << cloned_inputs << std::endl;

This outputs:

inputs within scope of vec: 
  10   15   20  100  500
[ CPULongType{1,5} ]
cloned_inputs within scope of vec: 
  10   15   20  100  500
[ CPULongType{1,5} ]
inputs beyond scope of vec: 
 9.4045e+13  1.0000e+00  1.0000e+00  0.0000e+00  5.0000e+02
[ CPULongType{1,5} ]
cloned_inputs beyond scope of vec: 
  10   15   20  100  500
[ CPULongType{1,5} ]

So there is a way to do this without copying the vector's data, using move semantics and placement new. The overall idea is that we can std::move from the vector (or from another object that stores its data in a contiguous buffer), placement-new it into a managed buffer, and then use torch::from_blob with a custom deleter that cleans up the memory when the tensor is destroyed.

For simplicity I show the C++17 approach; I’ve used this before with success.

#include <memory>   // std::unique_ptr, std::make_unique
#include <vector>   // std::vector

#include <torch/torch.h>

// traits classes for PyTorch tensor type constants
template <typename T>
struct tensor_type_traits {};

// for 32-bit signed int
template <>
struct tensor_type_traits<int> {
  static constexpr auto typenum = torch::kInt32;
};

// ... e.g. for double (kFloat64), float (kFloat32), etc.

/**
 * Return a 1D PyTorch tensor from an STL vector.
 *
 * @tparam T Element type
 * @tparam A Allocator type
 *
 * @param vec Vector to consume
 * @param options Tensor creation options
 */
template <typename T, typename A>
torch::Tensor make_tensor(
  std::vector<T, A>&& vec, torch::TensorOptions options = {})
{
  using V = std::vector<T, A>;
  // allocate storage for placement new (on exception also prevents leaks)
  auto buf = std::make_unique<unsigned char[]>(sizeof(V));
  // placement new + get pointer to moved vector
  auto vptr = new(buf.get()) V{std::move(vec)};
  // create PyTorch 1D tensor
  auto ten = torch::from_blob(
    vptr->data(),
    {static_cast<int64_t>(vptr->size())},
    // note: argument is unused since we are deleting through vptr
    [vptr](void*)
    {
      // take ownership of the buffer for later deletion on scope exit
      std::unique_ptr<unsigned char[]> vbuf{(unsigned char*) vptr};
      vptr->~V();
    },
    // data type determined via traits class specializations
    options.dtype(tensor_type_traits<T>::typenum)
  );
  // we only release the buffer now in case from_blob throws
  buf.release();
  return ten;
}

We can use the make_tensor() function template as follows:

// some STL vector
std::pmr::vector<double> vec{4., 2.322, 2.432, 6.34, 5.343};
// create 1D tensor with gradient requirement by consuming vector
auto ten = make_tensor(std::move(vec), torch::requires_grad());
// ...

You do need to be careful not to resize ten, because the Tensor does not actually know anything about its underlying storage. But if you just need a tensor as an input so you can call backward() and grad(), this is sufficient. Of course, this can be extended if necessary.
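For instance, continuing the snippet above, a minimal autograd round trip might look like this:

auto loss = ten.sum();                 // scalar loss, so backward() needs no argument
loss.backward();
std::cout << ten.grad() << std::endl;  // d(sum)/d(ten[i]) == 1 for every element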

I’ve made similar overloads for Eigen3 matrices; since they can be row- or column-major (defaulting to the latter), a bit of if constexpr is required to determine strides.
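For illustration, here is a simplified, non-owning sketch of that stride logic (make_tensor_view is a made-up name; unlike make_tensor() above, it borrows the matrix's buffer rather than moving from it, so the matrix must outlive the tensor, and it assumes a tensor_type_traits specialization exists for T):

#include <array>

#include <Eigen/Core>

template <typename T, int R, int C, int Opts>
torch::Tensor make_tensor_view(
  Eigen::Matrix<T, R, C, Opts>& mat, torch::TensorOptions options = {})
{
  const auto rows = static_cast<int64_t>(mat.rows());
  const auto cols = static_cast<int64_t>(mat.cols());
  // strides are in elements and depend on Eigen's storage order
  std::array<int64_t, 2> strides;
  if constexpr ((Opts & Eigen::RowMajor) != 0)
    strides = {cols, 1};  // row-major: the next row starts cols elements away
  else
    strides = {1, rows};  // column-major (Eigen's default): the next column starts rows elements away
  return torch::from_blob(
    mat.data(), {rows, cols}, strides,
    options.dtype(tensor_type_traits<T>::typenum));
}

Because the strides are passed explicitly, t[i][j] matches mat(i, j) regardless of the storage order.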