Is there a way to get the underlying memory array size of a tensor?

E.g. I have this code snippet

x = torch.rand(1000, 1000)  # this allocates a contiguous memory block of ~4 MB (10^6 float32 values, 4 bytes each)
x1 = x[0:1]  # select only the first row; x1 itself holds just 1000 floats, but it still shares the underlying memory array with x
torch.save(x1, 'some/disk/file')  # after this, `some/disk/file` is roughly 4 MB instead of ~4 KB, because the whole shared storage gets serialized

Hence the need: I want to avoid this wasted space with a smart size check before saving to disk (instead of calling .clone().contiguous() every time). Is there a way to do it? The usual .numel() etc. doesn’t suit this use case.

Or even better, are there any “somewhat low-level” torch APIs that expose information about the underlying memory array, e.g. whether two tensors share the same memory array, and how large that array is?

Hi Fei!

Try some_tensor.untyped_storage() and some_tensor.data_ptr().
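
For example, something along these lines could work as a size check before saving (just a rough sketch; save_compact is a made-up helper name):

import torch

def save_compact(t, path):
    # bytes the tensor's own elements need vs. bytes held by its underlying storage
    needed = t.numel() * t.element_size()
    held = t.untyped_storage().nbytes()
    if held > needed:
        # the tensor only uses part of a (shared) storage; clone so torch.save
        # serializes just the tensor's own data instead of the whole storage
        t = t.clone()
    torch.save(t, path)

x = torch.rand(1000, 1000)
x1 = x[0:1]
# both views sit in the same storage, so the base pointers match
print(x1.untyped_storage().data_ptr() == x.untyped_storage().data_ptr())  # True
save_compact(x1, 'some/disk/file')  # writes ~4 KB instead of ~4 MB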

Best.

K. Frank


This is exactly what I’m looking for. Thank you!

I found a new tricky case:
If I construct a torch tensor by calling from_numpy on an np.ndarray slice, the tensor keeps the full shared memory of the ndarray alive, but .untyped_storage().nbytes() only reports the tensor’s own size (instead of the full shared size). The program keeps consuming the full underlying memory block even after the original ndarray is deallocated. See this code snippet as an example:

import time
import gc
import numpy as np
import torch

# Create an np.ndarray taking ~8 GB of memory (2^30 float64 values), then construct the tensor from only its first-element slice
x = torch.from_numpy(np.ones(2 ** 30)[:1])

# At this point the giant array (a temporary) has gone out of scope. Call gc.collect() to make sure it is actually collected.
gc.collect()

print(x.untyped_storage().nbytes())  # output: 8 (one float64)

# However, `top` still shows the process using ~8 GB of memory. I paused the program and watched the `top` output for 10 seconds to make sure ^_^!
time.sleep(10)

I’m pretty sure this is not a garbage-collection issue (gc is doing its job), since I tested several scenarios (e.g. a: create only the large ndarray and check whether memory drops after gc; b: delete both the ndarray and the tensor and then gc). The results show that it is indeed the torch tensor holding on to the large 8 GB memory chunk (which is not revealed by its .untyped_storage().nbytes()). Is there a way to identify such cases?
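
One workaround I’m experimenting with on the numpy side (just a sketch, and from_numpy_compact is a made-up helper name; it relies on ndarray.base rather than any torch API): check whether the array is a view into a larger buffer before handing it to from_numpy, and copy it first if so.

import numpy as np
import torch

def from_numpy_compact(arr):
    # If arr is a view (arr.base is not None) whose own bytes are smaller than the
    # buffer it views into, copy it first so the resulting tensor doesn't keep the
    # whole parent buffer alive.
    base = arr.base
    if base is not None and arr.nbytes < getattr(base, 'nbytes', 0):
        arr = arr.copy()
    return torch.from_numpy(arr)

x = from_numpy_compact(np.ones(2 ** 30)[:1])
print(x.untyped_storage().nbytes())  # 8, and nothing keeps the 8 GB parent buffer alive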