How to know if a tensor fits in GPU RAM?

Is there a way to know whether a tensor will fit into the remaining GPU RAM before creating it?


I’ve been trying to figure out the same thing. A recent paper, “DNNMem”, can apparently analyze a model file and estimate its memory footprint before it is loaded, but the authors haven’t released any code. In the meantime, what I’ve been doing is wrapping allocations in exception handlers so I can recover from CUDA OOMs, as in the sketch below.
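Roughly this pattern (a minimal sketch; try_allocate is just an illustrative helper name, not a real API):

import torch

def try_allocate(shape, device="cuda", dtype=torch.float32):
    # Attempt the allocation and back off on OOM instead of letting the job crash.
    try:
        return torch.empty(shape, device=device, dtype=dtype)
    except RuntimeError as e:
        # CUDA OOMs surface as RuntimeError ("CUDA out of memory ...").
        if "out of memory" in str(e):
            torch.cuda.empty_cache()  # release unused cached blocks back to the GPU
            return None
        raise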

I’ve tried to calculate free memory as total_memory - memory_allocated and as total_memory - memory_reserved (a sketch of that version is below), and I’ve also tried pynvml. Here is the pynvml snippet:

import pynvml

def remaining_memory():
    # Free GPU memory on device 0 as reported by NVML (the same numbers nvidia-smi shows).
    pynvml.nvmlInit()
    gpu_handle = pynvml.nvmlDeviceGetHandleByIndex(0)
    info = pynvml.nvmlDeviceGetMemoryInfo(gpu_handle)
    pynvml.nvmlShutdown()
    return info.free
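
And the torch.cuda version I mentioned looks roughly like this (a sketch; the function name is just illustrative):

import torch

def remaining_memory_torch(device=0):
    # Free memory from PyTorch's point of view: total device memory minus
    # what the caching allocator has reserved. Using memory_allocated() instead
    # would undercount, since it ignores the allocator's cached blocks.
    total = torch.cuda.get_device_properties(device).total_memory
    reserved = torch.cuda.memory_reserved(device)
    return total - reserved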

None of these gives an accurate answer.
I suspect it has something to do with memory fragmentation: if the remaining space is not contiguous, a large tensor won’t fit even when the totals say there is enough free memory. I just want to know how PyTorch checks whether a tensor will fit in memory.
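The closest workaround I’ve found is to estimate the tensor’s byte size myself and compare it against the free memory the driver reports via torch.cuda.mem_get_info(), but because of the fragmentation issue above this can only be a heuristic. A rough sketch (probably_fits is just an illustrative name):

import torch

def probably_fits(shape, dtype=torch.float32, device=0):
    # Heuristic only: compare the tensor's byte size against the free memory
    # reported by the CUDA driver (cudaMemGetInfo). Fragmentation can still
    # make an allocation of this size fail even when this check passes.
    needed = torch.Size(shape).numel() * torch.empty((), dtype=dtype).element_size()
    free, _total = torch.cuda.mem_get_info(device)
    return needed <= free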