Detect model size

Let’s say we have 10 neural network models. In order to decide how to partition these models amongst a set of GPUs, we need to know the size of each model. How do we calculate that?

You can estimate the memory footprint of the model itself by summing the number of elements in its parameters and buffers (and any other tensors, if needed) and multiplying by the element size of the dtype (e.g. 4 bytes for float32).
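A minimal sketch of that estimate, assuming a standard `nn.Module`. The helper name `model_size_bytes` and the toy model are made up for illustration. Using `element_size()` per tensor instead of a fixed factor also handles models with mixed dtypes:

```python
import torch
import torch.nn as nn


def model_size_bytes(model: nn.Module) -> int:
    """Sum the bytes held by all parameters and buffers of `model`."""
    param_bytes = sum(p.numel() * p.element_size() for p in model.parameters())
    buffer_bytes = sum(b.numel() * b.element_size() for b in model.buffers())
    return param_bytes + buffer_bytes


# Hypothetical toy model, only here to have something to measure.
model = nn.Sequential(
    nn.Linear(1024, 512),
    nn.BatchNorm1d(512),  # contributes buffers (running stats) as well
    nn.Linear(512, 10),
)

size = model_size_bytes(model)
print(f"parameters + buffers: {size} bytes ({size / 1024**2:.2f} MiB)")
```

This only counts the tensors registered on the module, so it is a lower bound on what training will actually need.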
However, this would not give you the “complete” memory usage, since the forward activations (intermediates) and the gradients also use memory. If you are using cuDNN, note that different algorithms also need different amounts of memory for their workspace, especially if you set torch.backends.cudnn.benchmark = True.
I would thus recommend performing an example training step with the shapes you are planning to use and checking the memory usage, e.g. via torch.cuda.memory_summary().
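Something along these lines could be used to measure a single step. This is only a sketch: the helper name `peak_memory_for_step`, the SGD optimizer, the MSE loss, and the toy shapes are placeholders for whatever your real training loop uses, and the function simply returns `None` when no GPU is available:

```python
import torch
import torch.nn as nn


def peak_memory_for_step(model, batch, target, device="cuda"):
    """Run one training step and return the peak allocated bytes on `device`."""
    if not torch.cuda.is_available():
        return None  # nothing to measure without a GPU

    model = model.to(device)
    batch, target = batch.to(device), target.to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # placeholder optimizer

    # Reset the peak counters so earlier allocations do not skew the result.
    torch.cuda.reset_peak_memory_stats(device)

    loss = nn.functional.mse_loss(model(batch), target)  # forward (activations)
    loss.backward()                                      # backward (gradients)
    optimizer.step()
    torch.cuda.synchronize(device)

    print(torch.cuda.memory_summary(device))
    return torch.cuda.max_memory_allocated(device)


# Hypothetical shapes; use the ones you plan to train with.
peak = peak_memory_for_step(nn.Linear(256, 64), torch.randn(32, 256), torch.randn(32, 64))
print("peak bytes:", peak)
```

The peak value depends on the batch size and input shapes, so it is worth repeating the measurement for each configuration you intend to run.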


The problem is that the models I am referring to are usually too big to fit in memory, and the point of calculating the size is to decide how to distribute each sub-model across multiple GPUs. So I can’t load the entire model first to calculate its size.

Do you have any code for the first method of calculating memory?
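One possible way to apply the first method without loading the weights, assuming PyTorch 2.0 or later, is to construct the model on the "meta" device. Meta tensors carry shape and dtype but allocate no storage, so the sizes can still be summed. `build_model` below is a hypothetical stand-in for your own model constructor, and the per-child breakdown is just one way to get sub-model sizes:

```python
import torch
import torch.nn as nn


def build_model() -> nn.Module:
    # Hypothetical stand-in for the real (too large) model.
    return nn.Sequential(
        nn.Linear(8192, 8192),
        nn.ReLU(),
        nn.Linear(8192, 8192),
    )


def tensor_bytes(module: nn.Module) -> int:
    """Bytes that the parameters and buffers of `module` would occupy."""
    tensors = list(module.parameters()) + list(module.buffers())
    return sum(t.numel() * t.element_size() for t in tensors)


# Tensors created under this context have shapes and dtypes but no data.
with torch.device("meta"):
    meta_model = build_model()

# Size of each top-level sub-module, keyed by its name in the parent.
sub_sizes = {name: tensor_bytes(child) for name, child in meta_model.named_children()}
for name, nbytes in sub_sizes.items():
    print(f"{name}: {nbytes / 1024**2:.1f} MiB")
```

As with the plain parameter count, this covers only weights and buffers, not activations, gradients, or workspace memory.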