C++: seemingly premature deallocation of a temporary variable

Hello!

While implementing some C++ extensions, I noticed that if I transfer a GPU tensor to the CPU using a one-liner (auto my_tensor_ptr = my_tensor.cpu().data_ptr<float>();), this pointer usually ends up dangling, and the data indexed via the bracket operator [] is no longer valid. It can easily be fixed by introducing a temporary variable:

auto my_tensor_cpu = my_tensor.cpu();
auto my_tensor_ptr = my_tensor_cpu.data_ptr<float>();

Nevertheless, I still don't understand why this deallocation occurs.
Any explanation would be highly appreciated.

Hi,

This is because the data returned by data_ptr<>() is only valid as long as the original Tensor exists.
But here the CPU tensor goes out of scope, so the data_ptr becomes invalid. This is expected.
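The lifetime rule can be reproduced without libtorch. Below is a minimal sketch with a hypothetical FakeTensor type (not the real ATen API): the handle is a small value object whose heap storage is reference-counted, roughly like a real Tensor. A named variable keeps the storage alive; the one-liner pattern lets the last owner die while only the raw pointer escapes.

```cpp
#include <memory>
#include <vector>

// Hypothetical stand-in for a refcounted tensor (NOT the real ATen API):
// the handle is a value; the storage lives on the heap behind a shared_ptr.
struct FakeTensor {
    std::shared_ptr<std::vector<float>> storage;
    float* data_ptr() const { return storage->data(); }
    FakeTensor cpu() const {
        // Like my_tensor.cpu(): returns a NEW tensor owning a fresh copy.
        return FakeTensor{std::make_shared<std::vector<float>>(*storage)};
    }
};

// Mimics the one-liner: the tensor returned by cpu() dies inside this
// function, so only a raw pointer to freed storage escapes. The weak_ptr
// lets the caller observe the storage's lifetime without touching freed memory.
inline float* one_liner(const FakeTensor& t,
                        std::weak_ptr<std::vector<float>>& watch) {
    FakeTensor tmp = t.cpu();
    watch = tmp.storage;
    return tmp.data_ptr();  // tmp (and its storage) is destroyed on return
}
```

With a named variable, as in the fix above, the pointer stays valid for as long as that variable lives; with the one-liner, the storage is already gone by the time you use the pointer.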


Hi,

Yes, this is clear.
But why does it cease to exist? Does it go out of scope? What is its scope? I am by no means a C++ expert, but the PyTorch code is so entangled that I can't even find where the cpu method is implemented (maybe here) to check how this tensor is created.

This would actually be the same in Python: if you create an object in the middle of a line and don't keep a reference to it, it will be gone by the end of that line.
Here you create a Tensor when doing my_tensor.cpu() (let's call it foo). Then this object is used to call .data_ptr<float>() on it. The result of that call (a float*) is saved into your variable my_tensor_ptr. At this point, nothing references foo anymore, so foo is deallocated.


I think it is not exactly the same as in Python, because in Python it may or may not be garbage-collected, whereas in C++ it is guaranteed to be destroyed ("All temporary objects are destroyed as the last step in evaluating the full-expression that (lexically) contains the point where they were created"). So, is the object returned by the cpu() method allocated on the stack? I assume that if it were dynamically allocated, some sort of memory leak would occur.
Still, it feels like the returned pointer should keep some kind of weak reference to the created CPU tensor to prevent it from being deallocated; that would make the one-liner work.
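The full-expression rule quoted above, and the "stack vs. leak" question, can both be illustrated with a small sketch (a hypothetical Handle type, standing in for a Tensor): the handle itself is a value and can live on the stack, but its storage is on the heap behind a refcount, so destroying the last handle frees the storage and nothing leaks.

```cpp
#include <memory>

// Hypothetical stand-in: like a Tensor, the handle is a small value object
// (it can live on the stack) whose heap storage is refcounted, so
// destroying the last handle frees the storage -- no leak.
struct Handle {
    std::shared_ptr<int> storage = std::make_shared<int>(7);
    int* data_ptr() const { return storage.get(); }
};

inline Handle make_handle() { return Handle{}; }

// Passes the temporary through while letting the caller watch its storage.
inline const Handle& watch_storage(const Handle& h, std::weak_ptr<int>& w) {
    w = h.storage;
    return h;
}
```

The temporary returned by make_handle() stays alive for the whole statement it appears in, and is destroyed (storage freed) at the end of that full expression, exactly as the standard's wording says.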

This is not possible. data_ptr<float>() returns just a raw pointer to float; we cannot attach any reference or ownership to it.
It is the same as getting a raw pointer to an object in C++ and then deleting the object: the raw pointer never keeps the object alive.
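A short sketch of that point, using std::shared_ptr as the owner (the analogy, not PyTorch's actual internals): the raw pointer obtained from it carries no ownership, so resetting the last owning reference destroys the object and the raw pointer silently dangles.

```cpp
#include <memory>

// data_ptr<float>() is analogous to .get() on an owning smart pointer:
// the raw pointer it returns carries no ownership and cannot keep the
// underlying object alive.
inline int* raw_view(const std::shared_ptr<int>& owner) {
    return owner.get();
}
```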

Note that in CPython, object lifetimes are reference-counted, so (unless you create reference cycles, which are rare) objects are destroyed as soon as they are no longer referenced.
