Determine peak memory requirement

Suppose I perform my_awesome_cuda_training() and I want to determine the maximum amount of memory that was used. Is that as simple as using torch.cuda.memory_stats()?

import torch

def my_awesome_cuda_training():
    torch.empty(2, *[1024] * 3, dtype=torch.uint8, device="cuda")


stats = torch.cuda.memory_stats()
peak_bytes_requirement = stats["allocated_bytes.all.peak"]
print(f"Peak memory requirement: {peak_bytes_requirement / 1024 ** 3:.2f} GB")
Peak memory requirement: 2.00 GB

It seems to work as intended, but I would be grateful for confirmation.



You can try the below!

# pip install pynvml
from pynvml import *  # wildcard import: too lazy, not a good thing

nvmlInit()  # must be called once before any other NVML query

def show_gpu(msg):
    handle = nvmlDeviceGetHandleByIndex(0)
    info = nvmlDeviceGetMemoryInfo(handle)
    total = info.total
    free = info.free
    used = info.used
    pct = used / total
    print(f"\n{msg} Used {100 * pct:2.1f}% ({used} out of {total}, {free} free)")

That would get me the currently used memory, but that is not what I need: I need the peak memory requirement up to a specific point.

In the snippet above, no CUDA memory is in use anymore at the point where I print the peak requirement, since the tensor is freed right away.
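To illustrate the difference between current and peak usage, here is a small sketch (the `current_and_peak_gb` helper is just for illustration, not part of torch): `torch.cuda.memory_allocated()` reports what is held right now, while `torch.cuda.max_memory_allocated()` keeps the high-water mark even after the tensor is freed.

```python
import torch

def current_and_peak_gb():
    # The tensor below (2 GB) is dropped immediately, so the *current*
    # figure falls back to ~0 while the *peak* counter keeps 2 GB.
    torch.empty(2, *[1024] * 3, dtype=torch.uint8, device="cuda")
    cur = torch.cuda.memory_allocated() / 1024 ** 3
    peak = torch.cuda.max_memory_allocated() / 1024 ** 3
    return cur, peak

if torch.cuda.is_available():
    cur, peak = current_and_peak_gb()
    print(f"current: {cur:.2f} GB, peak: {peak:.2f} GB")
```

A tool like pynvml (or `nvidia-smi`) only ever sees the "current" side of this, which is why it cannot answer the peak question after the fact.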

You can try this then,

As far as I can see this also only gives me the current load and not the maximum load since I’ve started my script.

Well you can keep on collecting the stats over a span of time?

Because it changes over time, so nothing can give you that number directly, right?

The whole point of determining the peak memory usage is that I don't want to do any manual intermediate checks. For me it is irrelevant how much memory is used at a specific time. All I want is to determine, after my code has run, how much memory was used at a maximum, i.e. how much memory is required to run my code.

Yes, the .peak stats will give you the maximum. You can use torch.cuda.reset_peak_memory_stats() to reset this peak if you need to monitor another peak usage.
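To make that pattern concrete, here is a sketch of the reset-then-measure workflow (the `measure_peak` helper is hypothetical, just wrapping the torch calls mentioned above):

```python
import torch

def measure_peak(fn):
    # Hypothetical helper: reset the peak counter, run the workload,
    # then read back the high-water mark in bytes.
    torch.cuda.reset_peak_memory_stats()
    fn()
    return torch.cuda.max_memory_allocated()

if torch.cuda.is_available():
    peak = measure_peak(
        lambda: torch.empty(2, *[1024] * 3, dtype=torch.uint8, device="cuda")
    )
    print(f"Peak memory requirement: {peak / 1024 ** 3:.2f} GB")
```

Calling `measure_peak` again with a different workload would give you that workload's own peak, independent of what ran before.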