How to calculate the GPU memory that a model uses?

I want to reduce the size of my PyTorch models since they consume a lot of GPU memory, and I am not going to train them again.

First, I thought I could convert them to TensorRT engines, and then I was curious how to calculate the amount of GPU memory they use.

The size of a PyTorch model can be calculated by summing the bytes of the tensors returned by model.parameters() and model.buffers().

I checked that the results were consistent.
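Concretely, the calculation looks like this (a minimal sketch; the helper name model_size_bytes and the small Conv/BatchNorm model are just illustrative):

```python
import torch.nn as nn

def model_size_bytes(model):
    # Sum bytes over learnable parameters (weights, biases) ...
    param_bytes = sum(p.nelement() * p.element_size() for p in model.parameters())
    # ... and over non-learnable buffers (e.g. BatchNorm running stats).
    buffer_bytes = sum(b.nelement() * b.element_size() for b in model.buffers())
    return param_bytes + buffer_bytes

m = nn.Sequential(nn.Conv2d(3, 16, 3), nn.BatchNorm2d(16))
print(f"{model_size_bytes(m) / 1024**2:.3f} MiB")
```

Note that buffers matter: a BatchNorm layer stores running_mean, running_var, and num_batches_tracked outside of model.parameters().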

But the size of a TensorRT engine, or of other scripted modules on the GPU, cannot be calculated with those torch functions.
So I thought I could check the GPU memory usage with the GPUtil library.

However, the memory usage reported by the GPUtil library (which uses nvidia-smi) was wildly different.
For example, one model is 13 MiB in size but almost 2 GiB was allocated on the GPU. Another model is 171 MiB but also around 2 GiB was allocated. I didn't put any other objects, such as inputs, on the GPU.

And even after deleting the model:

```python
del model
gpu = GPUtil.getGPUs()[0]
memoryUsed = gpu.memoryUsed
```

the reported memory was still not 0, while torch.cuda.memory_allocated(0) showed 0.
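A minimal reproduction of what I'm seeing (a sketch, assuming a CUDA-capable machine; the Linear model stands in for my real one):

```python
import torch

if torch.cuda.is_available():
    model = torch.nn.Linear(4096, 4096).cuda()
    del model
    torch.cuda.empty_cache()  # return cached blocks to the driver
    print(torch.cuda.memory_allocated())  # 0: no live tensors tracked by PyTorch
    print(torch.cuda.memory_reserved())   # 0: the caching allocator's pool is empty
    # ... yet nvidia-smi / GPUtil still reports a large usage for this process.
```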

How do you calculate the GPU memory that a PyTorch model uses?
Or how do you compare the GPU memory used by a PyTorch model and by its script-mode version?

And if I understood correctly and used the right functions, why is the actual allocated memory so different from the raw tensor bytes?
I knew it could differ because of allocation page sizes, but I didn't expect a difference this large (about 2 GiB).

PyTorch creates the CUDA context on the first CUDA operation. This loads the driver and kernels (native PyTorch kernels as well as those from used libraries, etc.) and takes some memory overhead depending on the device.
PyTorch doesn't report this memory, which is why torch.cuda.memory_allocated() can return 0.
You would thus need to use nvidia-smi (or any other "global" reporting tool) to check the overall GPU memory usage.
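A quick way to see this overhead (a sketch, assuming a CUDA-capable machine): force context creation with a single tiny allocation, then compare what PyTorch tracks against what nvidia-smi reports for the process.

```python
import torch

if torch.cuda.is_available():
    x = torch.ones(1, device="cuda")  # first CUDA op -> context is created
    allocated = torch.cuda.memory_allocated()  # bytes of live tensors PyTorch tracks
    reserved = torch.cuda.memory_reserved()    # pool held by the caching allocator
    print(f"allocated: {allocated} B, reserved: {reserved} B")
    # nvidia-smi will show far more than `reserved` for this process;
    # the difference is roughly the CUDA context (driver, kernels, cuBLAS/cuDNN),
    # which torch.cuda.* does not count.
else:
    print("CUDA not available; run on a GPU machine to observe the overhead")
```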
