List all tensors with their names and size

To profile the memory usage, I want to list all tensors with their name and size.

I have a function that can show all tensors with their size:

import gc

import torch


def pretty_size(size):
    """Pretty prints a torch.Size object"""
    assert isinstance(size, torch.Size)
    return " x ".join(map(str, size))


def dump_tensors(gpu_only=True):
    """Prints a list of the Tensors being tracked by the garbage collector."""
    total_size = 0
    for obj in gc.get_objects():
        try:
            if torch.is_tensor(obj):
                if not gpu_only or obj.is_cuda:
                    print(
                        "%s:%s%s %s"
                        % (
                            type(obj).__name__,
                            " GPU" if obj.is_cuda else "",
                            " pinned" if obj.is_pinned() else "",  # is_pinned is a method
                            pretty_size(obj.size()),
                        )
                    )
                    total_size += obj.numel()
            elif hasattr(obj, "data") and torch.is_tensor(obj.data):
                # Wrapper objects (e.g. old-style Variables) that hold a tensor in .data
                if not gpu_only or obj.data.is_cuda:
                    print(
                        "%s → %s:%s%s%s%s %s"
                        % (
                            type(obj).__name__,
                            type(obj.data).__name__,
                            " GPU" if obj.data.is_cuda else "",
                            " pinned" if obj.data.is_pinned() else "",
                            " grad" if obj.requires_grad else "",
                            " volatile" if getattr(obj, "volatile", False) else "",
                            pretty_size(obj.data.size()),
                        )
                    )
                    total_size += obj.data.numel()
        except Exception:
            # Some objects raise on attribute access; skip them
            pass
    print("Total size:", total_size)

But I don’t know how to get the names of the tensors. By name I mean, ideally, the parameter name the tensor has, but barring that, the names of the variables holding references to it.

Tensors don’t know their names, and they might not all have names.
What you can do, at the expense of speed, if your tensors require gradients, is to use anomaly mode to record the line where each tensor was instantiated:

with torch.autograd.detect_anomaly():
    a = torch.randn(5, 5, requires_grad=True)
    b = a * 2 + 1

then

b.grad_fn.metadata['traceback_'][-1]

has

'  File "<ipython-...>", line 3, in <module>\n    b = a * 2 + 1\n'

This only works on tensors inside the autograd graph.
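For tensors created while anomaly mode was active, the two pieces can be combined so the dump prints the recorded creation line next to each size. A rough sketch (it reuses pretty_size and the imports from above; the function name is just illustrative):

def dump_tensors_with_origin(gpu_only=True):
    """Like dump_tensors, but also prints the creation line recorded by anomaly mode."""
    for obj in gc.get_objects():
        try:
            if torch.is_tensor(obj) and (not gpu_only or obj.is_cuda):
                origin = ""
                if obj.grad_fn is not None:
                    tb = obj.grad_fn.metadata.get("traceback_")
                    if tb:
                        # The last entry is the line that produced this tensor
                        origin = "  <- " + tb[-1].strip()
                print("%s %s%s" % (type(obj).__name__, pretty_size(obj.size()), origin))
        except Exception:
            pass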

Another approach (with better coverage) could be to use the __torch_function__ hook to record the source line where each tensor is created.
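A minimal sketch of that idea, assuming a reasonably recent PyTorch that provides torch.overrides.TorchFunctionMode (the _created_at attribute name is my own choice, not a PyTorch API):

import traceback

import torch
from torch.overrides import TorchFunctionMode


class TagCreationSite(TorchFunctionMode):
    """Stamps every tensor returned by a torch call with the source line that created it."""

    def __torch_function__(self, func, types, args=(), kwargs=None):
        out = func(*args, **(kwargs or {}))
        if isinstance(out, torch.Tensor):
            # The frame just below this handler is the user code that called the torch op
            frame = traceback.extract_stack()[-2]
            out._created_at = "%s:%d  %s" % (frame.filename, frame.lineno, frame.line)
        return out


with TagCreationSite():
    a = torch.randn(5, 5)
    b = a * 2 + 1

print(b._created_at)  # something like "script.py:21  b = a * 2 + 1"

Since the tag is just a Python attribute on the tensor object, a dump_tensors-style loop can then print getattr(obj, "_created_at", "") next to each size.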