Discrepancy between NVIDIA-smi and torch.cuda.memory_allocated()

nvidia-smi reports 37 GB allocated, but torch.cuda.memory_allocated() reports 15 GB. I'm curious: why is there a discrepancy?

From the PyTorch docs (torch.cuda.memory_allocated — PyTorch 2.1 documentation):

This is likely less than the amount shown in nvidia-smi since some unused memory can be held by the caching allocator and some context needs to be created on GPU. See Memory management for more details about GPU memory management.

TL;DR: torch.cuda.memory_allocated() reports only the memory currently occupied by live tensors, whereas nvidia-smi additionally shows the CUDA context and the memory held (but not necessarily in use) by PyTorch's caching allocator.
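A minimal sketch that makes the difference visible (the tensor size here is just an illustration; the exact numbers will depend on your GPU and workload):

```python
import torch

# Allocate a large tensor on the GPU (~4 GB of float32).
x = torch.randn(1024, 1024, 1024, device="cuda")

# Memory occupied by live tensors: what memory_allocated() reports.
allocated = torch.cuda.memory_allocated() / 1024**3

# Memory reserved by the caching allocator, including cached-but-unused
# blocks. This is closer to the nvidia-smi figure, which additionally
# includes the CUDA context (a few hundred MB, driver-dependent).
reserved = torch.cuda.memory_reserved() / 1024**3

print(f"allocated: {allocated:.2f} GiB, reserved: {reserved:.2f} GiB")

del x
# The tensor is now freed from the allocator's point of view...
print(torch.cuda.memory_allocated())  # drops back toward 0
# ...but the caching allocator keeps the blocks for reuse, so
# nvidia-smi stays unchanged.
print(torch.cuda.memory_reserved())   # still high

# empty_cache() hands cached blocks back to the driver; only then does
# the nvidia-smi number go down (the CUDA context itself is never freed
# while the process lives).
torch.cuda.empty_cache()
print(torch.cuda.memory_reserved())
```

So the gap you're seeing is expected: nvidia-smi ≈ tensors + cached blocks + CUDA context, while memory_allocated() counts only the first term.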
