torch.cuda.max_memory_reserved example, please

I want to work out how many GPUs of a given type I need to rent so that my training job fits in the per-GPU vRAM, i.e. the value shown as 494MiB / 46068MiB under Memory-Usage when I run nvidia-smi -l, so that I don't hit a surprise CUDA OOM error mid-training and have to decrease batch_size.
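Roughly speaking, this is the calculation I have in mind (a minimal sketch; the 46068 MiB figure is just what nvidia-smi reports for the card I'm looking at, and the assumption that the job splits evenly across GPUs is mine):

import math

import torch

# Peak memory my job actually touched on this device, in MiB
peak_mib = torch.cuda.max_memory_reserved("cuda:0") / 1024 ** 2

# Per-GPU vRAM as reported by nvidia-smi for the card I'd rent (placeholder)
per_gpu_mib = 46068

# Rough number of GPUs I'd need, assuming the job could be split evenly
gpus_needed = math.ceil(peak_mib / per_gpu_mib)
print(f"peak {peak_mib:.0f} MiB -> {gpus_needed} GPU(s)")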

I think I'm probably misunderstanding what "reset" means here:

https://pytorch.org/docs/stable/generated/torch.cuda.max_memory_allocated.html#torch.cuda.max_memory_allocated

because I cannot get the peak memory stats to return to zero.

import torch

def max_memory_allocated_reserved(device="cuda:0"):
    # Peak allocated / reserved memory (in bytes) since the last reset
    gpu_memory_allo_bytes = torch.cuda.max_memory_allocated(device=device)
    gpu_memory_allo_gb = gpu_memory_allo_bytes / 1024 ** 3

    gpu_memory_res_bytes = torch.cuda.max_memory_reserved(device=device)
    gpu_memory_res_gb = gpu_memory_res_bytes / 1024 ** 3

    print(f"{device} reserved {gpu_memory_res_gb} GB, allocated {gpu_memory_allo_gb} GB")

    # Release cached blocks and restart the peak counters
    torch.cuda.empty_cache()
    torch.cuda.reset_peak_memory_stats(device)

    return gpu_memory_res_gb, gpu_memory_allo_gb
    


import torchvision.models as models

device="cuda:0"

model = models.resnet18().to(device)

inputs = torch.randn(5, 3, 224, 224).to(device)

max_memory_allocated_reserved(device)

# zero memory ?

max_memory_allocated_reserved(device)

inputs = torch.randn(100, 3, 224, 224).to(device)

max_memory_allocated_reserved(device)

outputs = model(inputs)

max_memory_allocated_reserved(device)

I want to reset the memory "watching" interval, so I expect the second set of print statements to show 0, and I also expect "cuda:0" to be cleared of the tensors. But even when, inside max_memory_allocated_reserved, I add

del model
del inputs

I still see 494MiB / 46068MiB under Memory-Usage when I run nvidia-smi -l. Furthermore, without deleting the tensors, the second call still reports 0.0464468 GB allocated and 0.0625 GB reserved.
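To be explicit, this is the sequence I assumed would bring everything back to zero (a sketch of my expectation, not something that works for me; the explicit gc.collect() is my own addition, on the assumption that cached blocks are only freed once the Python references are gone):

import gc

import torch

del model
del inputs

gc.collect()                                  # drop the Python references first
torch.cuda.empty_cache()                      # return cached blocks to the driver
torch.cuda.reset_peak_memory_stats("cuda:0")  # restart the peak "watching" interval

# I expected all four of these to print 0 afterwards
print(torch.cuda.memory_allocated("cuda:0"))
print(torch.cuda.memory_reserved("cuda:0"))
print(torch.cuda.max_memory_allocated("cuda:0"))
print(torch.cuda.max_memory_reserved("cuda:0"))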

thank you!