Overwriting from_blob tensor data

What do the words “exposes” and “without taking ownership” in the from_blob documentation

Exposes the given data as a Tensor without taking ownership of the original data.

https://pytorch.org/cppdocs/api/function_namespacetorch_1ac009244049812a3efdf4605d19c5e79b.html?highlight=from_blob

actually mean? Are copy operations internally performed on data by

at::Tensor torch::from_blob(void *data, at::IntArrayRef sizes, at::IntArrayRef strides, const at::TensorOptions &options = at::TensorOptions())

?

Is it possible to have a huge C array in memory, use from_blob to “view” it as a Tensor, and also modify the array’s data? Example:

#include <torch/torch.h>
#include <iostream>

int main()
{
    float array[] = {1.23, 2.34, 3.45, 4.56, 5.67};
    auto options = torch::TensorOptions().dtype(torch::kFloat32);
    // View the existing array as a tensor without copying its data.
    torch::Tensor t1 = torch::from_blob(array, {5}, options);
    std::cout << t1 << std::endl;

    // Intended to overwrite the array through the view.
    t1 = torch::ones_like(t1);

    std::cout << t1 << std::endl;

    for (auto el : array)
        std::cout << el << std::endl;
    return 0;
}

I expected this code to overwrite the array elements, but the output shows they are not overwritten:

 ./example-app 
 1.2300
 2.3400
 3.4500
 4.5600
 5.6700
[ CPUFloatType{5} ]
 1
 1
 1
 1
 1
[ CPUFloatType{5} ]
1.23
2.34
3.45
4.56
5.67

Are all Tensor constructors deep-copy operations, or is there one that imposes a “write” view on an array?

If it is not possible to construct a Tensor without copying data, what would be the best way to assign the values of t1 (in this simple example, the ones_like result) to array? Is there a way to get a pointer to the data from a Tensor?

from_blob creates a “view” of the underlying memory, so in-place operations on the tensor will also change the original array.
However, torch::ones_like does not create another view; it creates a new tensor, and the assignment rebinds the t1 variable to it.
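To make that concrete, here is a minimal sketch based on the example above: from_blob makes no copy (data_ptr returns the original pointer), and in-place operations, the ones with a trailing underscore such as fill_ or copy_, write through the view into the array:

#include <torch/torch.h>
#include <iostream>
#include <cassert>

int main()
{
    float array[] = {1.23, 2.34, 3.45, 4.56, 5.67};
    torch::Tensor t1 = torch::from_blob(array, {5});

    // No copy was made: the tensor aliases the array's memory.
    assert(t1.data_ptr<float>() == array);

    // In-place ops write through the view into the array ...
    t1.fill_(1.0);
    // ... or, to keep the original intent: t1.copy_(torch::ones_like(t1));

    for (auto el : array)
        std::cout << el << std::endl; // prints 1 five times
    return 0;
}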


OK, I just find that kind of strange: yes, ones_like generates another tensor, but shouldn’t the assignment operator of Tensor simply assign whatever ones_like creates to t1 without altering its view? Something like

Tensor& Tensor::operator=(Tensor const& rhs) {
  this->data = rhs.data; // in a sense
  return *this;
}

or

Tensor& Tensor::operator=(Tensor&& rhs) {
  this->data = std::move(rhs.data); // in a sense
  return *this;
}
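(For what it’s worth, that is essentially what does happen, and it is exactly why the view is lost: at::Tensor behaves like a reference-counted handle to its storage, so assignment rebinds the handle rather than copying elements into the old storage. A sketch of the distinction, with std::shared_ptr standing in for the handle:)

#include <memory>

int main()
{
    auto view  = std::make_shared<float>(1.0f); // plays the role of t1
    auto fresh = std::make_shared<float>(2.0f); // plays the role of ones_like(t1)

    view = fresh;   // rebinds the handle; the original float is untouched
    *view = 3.0f;   // writes through the handle; the tensor analogue is copy_()
    return 0;
}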

My real scenario looks something like

void* huge_data = complexObject.pointer();
Tensor nn_forward = from_blob(huge_data);

// train network

nn_forward = nn->forward();

So after seeing your answer, I am guessing that nn->forward() also creates a new Tensor, rebinds nn_forward, and discards the non-const view of huge_data. This means I need to copy-assign from the new nn_forward back to huge_data, and it also means that I am storing two huge_data-sized buffers in memory: one in complexObject that I want to modify, and another one generated by forward().

Is there a way to circumvent this and make forward write to huge_data via the non-const view of huge_data available in nn_forward?

If you want to write directly to huge_data, you could check if the last operation in your model supports the out argument so that you could pass this tensor to it (assuming the shape etc. is correct).
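For example (a sketch, assuming the last operation in the model has an out variant such as torch::add_out; which out variants exist depends on the op):

#include <torch/torch.h>

int main()
{
    float buffer[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    // Non-owning view: writes through this tensor land in buffer.
    torch::Tensor out = torch::from_blob(buffer, {4});

    torch::Tensor a = torch::ones({4});
    torch::Tensor b = torch::full({4}, 2.0);

    // The out variant writes its result directly into out's storage
    // (i.e. into buffer) instead of allocating a new tensor.
    torch::add_out(out, a, b); // buffer now holds {3, 3, 3, 3}
    return 0;
}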


I’m using Sequential, and (using Python as pseudocode)

def forward(sequential, input):
  for module in sequential:
    input = module(input)
  return input

It looks like Sequential does not offer an output argument to forward; is that true?

I’ve solved it with a for loop (forAll is OpenFOAM’s macro for a standard C++ for loop):

    // Evaluate the best NN.
    //  - Reinterpret OpenFOAM's output volScalarField as a scalar* array
    volScalarField::pointer vf_nn_data = vf_nn.ref().data();
    //  - Use the scalar* (volScalarField::pointer) to view
    //    the volScalarField as a torch::Tensor without copying data.
    //    (OpenFOAM's scalar is double in a default build, hence kFloat64.)
    torch::Tensor vf_nn_tensor =
        torch::from_blob(vf_nn_data, {vf.size()}, torch::kFloat64);
    //  - Evaluate the volScalarField vf_nn using the best NN model.
    vf_nn_tensor = nn_best->forward(cc_tensor);
    //  - FIXME: 2022-06-01, the C++ PyTorch API does not overwrite the blob object.
    //           If a Model is coded by inheritance, maybe forward(input, output) is
    //           available, that overwrites the data in vf_nn by acting on the
    //           non-const view of the data given by vf_nn_tensor. TM.
    forAll(vf_nn, cellI)
    {
        vf_nn[cellI] = vf_nn_tensor[cellI].item<double>();
    }
    //  - Evaluate the vf_nn boundary conditions.
    vf_nn.correctBoundaryConditions();

in case someone else requires this, but I am not sure it is fast, and it is definitely not clean… Also, it is relatively simple for scalar data; transferring vectors and other tensors back to the source of huge_data becomes cumbersome. I’ve checked the assignment operators in the C++ API: the documentation doesn’t say anything about data ownership transfer, and I didn’t find a deep-copy option. std::copy fails because Tensor is missing begin()/end() member functions.
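A tighter alternative to the element-wise loop, sketched below with stand-ins for the names used above (the buffer plays the role of vf_nn_data, the ones tensor the role of forward()'s output): Tensor::copy_ deep-copies into a from_blob view and converts dtype on the way, while Tensor::data_ptr gives a raw pointer usable with std::copy_n once the result is contiguous and of matching dtype.

#include <torch/torch.h>
#include <algorithm>

int main()
{
    double buffer[5] = {0, 0, 0, 0, 0};      // stand-in for vf_nn_data
    torch::Tensor result = torch::ones({5}); // stand-in for forward()'s output

    // Option 1: deep copy through a non-owning view; copy_ also casts
    // float -> double here.
    torch::Tensor view = torch::from_blob(buffer, {5}, torch::kFloat64);
    view.copy_(result);

    // Option 2: raw pointers (dtype and memory layout must match).
    torch::Tensor res64 = result.to(torch::kFloat64).contiguous();
    std::copy_n(res64.data_ptr<double>(), res64.numel(), buffer);
    return 0;
}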