How to estimate available GPU memory

I am using PyTorch 1.2 with the following code to estimate the available GPU memory:
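import torch
from pynvml import nvmlInit, nvmlDeviceGetHandleByIndex, nvmlDeviceGetMemoryInfo

nvmlInit()  # NVML must be initialized before any device query
h = nvmlDeviceGetHandleByIndex(0)
info = nvmlDeviceGetMemoryInfo(h)
#print(f'total : {info.total}')
print(f'free : {info.free}')
#print(f'used : {info.used}')
print('torch cached: {}'.format(torch.cuda.memory_cached(0)))
print('torch allocated: {}'.format(torch.cuda.memory_allocated(0)))
# free memory reported by the driver plus cached-but-unused memory held by PyTorch
remaining = info.free + (torch.cuda.memory_cached(0) - torch.cuda.memory_allocated(0))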

In summary, I use free + (cached - allocated) as the GPU memory available for allocating further tensors.
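To make the estimate easier to check, here is a minimal sketch of it as a small helper. The names `estimate_available` and `query_estimate` are hypothetical (they are mine, not part of PyTorch or pynvml), and the sketch assumes the `pynvml` package and a CUDA build of PyTorch 1.2 are installed. The arithmetic is kept separate from the GPU queries so it can be tested without a GPU.

```python
def estimate_available(free_bytes, cached_bytes, allocated_bytes):
    """Return free + (cached - allocated), all in bytes.

    free_bytes      : free device memory as reported by NVML
    cached_bytes    : memory held by PyTorch's caching allocator
    allocated_bytes : the part of the cache occupied by live tensors
    """
    return free_bytes + (cached_bytes - allocated_bytes)


def query_estimate(device_index=0):
    """Hypothetical wrapper: read the three values from a real GPU."""
    # Imported lazily so the pure function above works without a GPU.
    import torch
    from pynvml import nvmlInit, nvmlDeviceGetHandleByIndex, nvmlDeviceGetMemoryInfo

    nvmlInit()
    info = nvmlDeviceGetMemoryInfo(nvmlDeviceGetHandleByIndex(device_index))
    # memory_cached is the PyTorch 1.2 name; later releases call it memory_reserved.
    return estimate_available(
        info.free,
        torch.cuda.memory_cached(device_index),
        torch.cuda.memory_allocated(device_index),
    )
```

Calling `query_estimate(0)` right before a large allocation gives the number I compare against the tensor size.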

However, I encountered an error:
RuntimeError: CUDA out of memory. Tried to allocate 5.57 GiB (GPU 0; 15.77 GiB total capacity; 5.82 GiB already allocated; 3.31 GiB free; 5.50 GiB cached)

I checked the code here:
It seems that the "cached" figure in the error message is stats.amount_cached - stats.amount_allocated, i.e. memory that PyTorch holds but that no tensor currently uses.
If so, that memory should be reusable, and free (3.31 GiB) + cached (5.50 GiB) is larger than the amount I am trying to allocate (5.57 GiB).
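For reference, here is the arithmetic with the numbers from the error message, as a minimal sketch. It assumes my reading above is right, namely that the reported "cached" value is the unused part of the cache. Note that it only compares sums; it says nothing about whether that memory is available as one contiguous block, which my estimate implicitly assumes.

```python
# Values copied from the error message, in GiB.
free = 3.31            # "3.31 GiB free"
cached_unused = 5.50   # "5.50 GiB cached" (assumed: cached minus allocated)
requested = 5.57       # "Tried to allocate 5.57 GiB"

estimate = free + cached_unused
headroom = estimate - requested

print(f"estimated available: {estimate:.2f} GiB")
print(f"requested:           {requested:.2f} GiB")
print(f"headroom:            {headroom:.2f} GiB")
```

By this count the request should fit with roughly 3.24 GiB to spare, which is why the error surprises me.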

So what is the reason? I suspect I am misunderstanding part of the mechanism. Could you help me? Thanks!

Actually, I logged the cached and allocated values at runtime, and their difference appears to be large enough for the tensor to be allocated. So why does the error occur?