nvidia-smi shows 37 GB allocated, but torch.cuda.memory_allocated() reports 15 GB. I'm curious: why is there a discrepancy?
From the PyTorch docs (torch.cuda.memory_allocated — PyTorch 2.1 documentation):
This is likely less than the amount shown in nvidia-smi since some unused memory can be held by the caching allocator and some context needs to be created on GPU. See Memory management for more details about GPU memory management.
The TL;DR is that torch.cuda.memory_allocated() reports only the memory currently occupied by tensors, whereas nvidia-smi also includes the CUDA context and the unused memory held by PyTorch's caching allocator.
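A minimal sketch of the difference, assuming a CUDA-capable machine (the tensor size is arbitrary, chosen to be roughly 1 GiB):

```python
import torch

# Allocate a ~1 GiB float32 tensor (1024 * 1024 * 256 * 4 bytes).
x = torch.empty(1024, 1024, 256, device="cuda")
print(torch.cuda.memory_allocated() / 2**30)  # ~1.0 GiB: memory held by live tensors

# Free the tensor. The caching allocator keeps the freed block for
# reuse instead of returning it to the driver.
del x
print(torch.cuda.memory_allocated() / 2**30)  # ~0.0 GiB: no live tensors
print(torch.cuda.memory_reserved() / 2**30)   # ~1.0 GiB: still cached by PyTorch

# nvidia-smi would still show the cached block *plus* the CUDA context
# (typically a few hundred MB), neither of which memory_allocated() counts.
# torch.cuda.empty_cache() releases the cached blocks back to the driver,
# but the context memory remains for the lifetime of the process.
```

So torch.cuda.memory_reserved() is usually the closer match to what nvidia-smi reports, and the remaining gap is mostly the CUDA context.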