I wanted to reduce the size of my PyTorch models since they consume a lot of GPU memory and I am not going to train them again.
First, I thought I could convert them to TensorRT engines, and then I was curious how I could calculate the size of the GPU memory they use.
The size of a PyTorch model can be calculated by summing the bytes of its model.parameters() and model.buffers().
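Here is a minimal sketch of that calculation (the helper name and the example model are my own, not from any library):

```python
import torch
import torch.nn as nn

def model_size_bytes(model: nn.Module) -> int:
    """Sum the raw bytes of all parameters and buffers of a model."""
    param_bytes = sum(p.nelement() * p.element_size() for p in model.parameters())
    buffer_bytes = sum(b.nelement() * b.element_size() for b in model.buffers())
    return param_bytes + buffer_bytes

# Hypothetical example: a float32 Linear layer with 1000*1000 weights + 1000 biases
model = nn.Linear(1000, 1000)
print(model_size_bytes(model) / 1024**2, "MiB")  # ~3.82 MiB
```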
I checked whether the above results had the same values, and they did.
But the size of a TensorRT engine, or of other scripted modules on the GPU, cannot be calculated with the above torch functions.
So I thought I could check the GPU memory usage with the GPUtil library. However, the memory usage reported by GPUtil (which reads from nvidia-smi) was very different.
For example, one model is 13 MiB in size but almost 2 GiB was allocated on the GPU. Another model is 171 MiB but also around 2 GiB was allocated. I didn't put any other objects, such as inputs, on the GPU.
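One way I tried to isolate the model's own allocation (a sketch, assuming a CUDA device is available; the Linear model is just a stand-in) is to diff the allocator's counter before and after moving the model to the GPU. This also works for scripted modules, since it doesn't inspect the module's parameters at all:

```python
import torch
import torch.nn as nn

# Measure only what this model adds to PyTorch's allocator,
# ignoring whatever else is already resident on the device.
if torch.cuda.is_available():
    before = torch.cuda.memory_allocated(0)
    model = nn.Linear(1000, 1000).cuda()  # hypothetical model; could be a scripted module
    after = torch.cuda.memory_allocated(0)
    print((after - before) / 1024**2, "MiB allocated for the model")
```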
And even after deleting the model:

```python
del model
gpus = GPUtil.getGPUs()       # returns a list of GPU objects
memoryUsed = gpus[0].memoryUsed
```

the memory was still not 0, while torch.cuda.memory_allocated(0) showed 0.
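For comparison, this sketch prints the two counters PyTorch itself exposes (assuming a CUDA device). torch.cuda.memory_allocated counts only live tensor bytes, while torch.cuda.memory_reserved also includes the caching allocator's pool; nvidia-smi additionally sees the CUDA context, which is why its number can stay high even when both counters are 0:

```python
import torch

if torch.cuda.is_available():
    # Bytes held by live tensors
    print("allocated:", torch.cuda.memory_allocated(0))
    # Bytes reserved by PyTorch's caching allocator (>= allocated)
    print("reserved: ", torch.cuda.memory_reserved(0))
```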
How do you calculate the GPU memory that a PyTorch model uses?
Or how do you compare the GPU memory a PyTorch model uses with what its script-mode version uses?
And if I understood correctly and used the right functions, why is the actual allocated memory so different from the raw torch tensor bytes?
I knew it could differ because of page-size alignment, but I didn't expect the difference to be that large (about 2 GiB).