Assign a memory blob to a PyTorch output tensor (C++ API)

I am training a linear model with PyTorch and saving it to a file with the `save` call. Separate C++ code then loads the model and performs inference.
I would like to instruct the Torch C++ library (LibTorch) to use a specific memory blob for the final output tensor. Is this even possible? If so, how? Below is a small example of what I am trying to achieve.


#include <cstdlib>  // atoi
#include <iostream>
#include <memory>

#include <torch/script.h>

int main(int argc, const char* argv[]) {
  if (argc != 3) {
    std::cerr << "usage: example-app <path-to-exported-script-module> <size-in-MB>\n";
    return -1;
  }
  // argv[2] gives the buffer size in MB; convert to a number of floats
  long numElements = (1024 * 1024) / sizeof(float) * atoi(argv[2]);

  float *a = new float[numElements];
  float *b = new float[numElements];
  // pre-allocated blob that should receive the model output (4 columns)
  float *c = new float[numElements * 4];

  for (int i = 0; i < numElements; i++){
    a[i] = i;
    b[i] = -i;
  }

  at::Tensor a_t = torch::from_blob(a, {numElements, 1});
  at::Tensor b_t = torch::from_blob(b, {numElements, 1});
  // wrap the pre-allocated blob `c` (not `b`); this is where I want the output to land
  at::Tensor out = torch::from_blob(c, {numElements, 4});

  // column-wise concat of a and b: shape {numElements, 2}
  at::Tensor c_t = at::cat({a_t, b_t}, 1);
  // reshape is a no-op here (c_t already has this shape)
  at::Tensor d_t = at::reshape(c_t, {numElements, 2});

  torch::jit::script::Module module;
  try {
    module = torch::jit::load(argv[1]);
  }
  catch (const c10::Error& e) {
    std::cerr << "error loading the model: " << e.what() << "\n";
    return -1;
  }


  // This assignment rebinds `out` to a freshly allocated result tensor;
  // the blob `c` wrapped above is never written to.
  out = module.forward({d_t}).toTensor();
  std::cout << out.sizes() << "\n";

  delete [] a;
  delete [] b;
  delete [] c;

  return 0;
}
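
For reference, this is how I check where the result actually ends up (a minimal sketch using the variables from the code above; `result` is just an illustrative name):

// Sketch: if forward() had written into my blob, the result's storage
// would still point at `c` -- in my runs it does not.
at::Tensor result = module.forward({d_t}).toTensor();
std::cout << std::boolalpha
          << (result.data_ptr<float>() == c) << "\n";  // prints false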

So, I allocate memory for `c` and create a tensor out of that memory, which I name `out`. I load the model and call its `forward` method. I observe that the result lands in a freshly allocated tensor that is then assigned to `out`; my `c` buffer is never written. However, I would like to instruct Torch to store the result directly into the memory backing `out`. Is this possible?
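
The obvious workaround is an explicit copy into the blob after the fact, but that still pays for the intermediate allocation and transfer (a minimal sketch, assuming the model output has shape {numElements, 4}):

at::Tensor out = torch::from_blob(c, {numElements, 4});
at::Tensor result = module.forward({d_t}).toTensor();
out.copy_(result);  // data now lives in `c`, but only after an extra copy

What I am after is having forward() write straight into that buffer instead.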