I am running a model in eval mode. After the forward pass, I added these lines of code to check how much GPU memory is in use.
print("torch.cuda.memory_allocated: %fGiB"%(torch.cuda.memory_allocated(0)/1024/1024/1024))
print("torch.cuda.memory_reserved: %fGiB"%(torch.cuda.memory_reserved(0)/1024/1024/1024))
print("torch.cuda.max_memory_reserved: %fGiB"%(torch.cuda.max_memory_reserved(0)/1024/1024/1024))
which prints out
torch.cuda.memory_allocated: 0.004499GiB
torch.cuda.memory_reserved: 0.007812GiB
torch.cuda.max_memory_reserved: 0.007812GiB
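(For reference, the division by 1024/1024/1024 in the prints converts a raw byte count to GiB, i.e. 1024³ bytes per unit. A minimal helper making that explicit — the name `fmt_gib` is just illustrative, not a PyTorch API:)

```python
def fmt_gib(n_bytes):
    # 1 GiB = 1024**3 bytes; the same conversion the prints above apply
    # to the byte counts returned by torch.cuda.memory_allocated & co.
    return "%.6fGiB" % (n_bytes / 1024**3)

# Sanity check with a known value: 7 MiB is 7/1024 GiB.
print(fmt_gib(7 * 1024**2))  # -> 0.006836GiB
```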
However, running nvidia-smi
tells me that the python process is using 1349 MiB (about 1.32 GiB). What causes the difference?
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02 Driver Version: 470.57.02 CUDA Version: 11.4 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:01:00.0 On | N/A |
| N/A 57C P0 33W / N/A | 2392MiB / 7982MiB | 3% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 1103 G /usr/lib/xorg/Xorg 106MiB |
| 0 N/A N/A 1702 G /usr/lib/xorg/Xorg 476MiB |
| 0 N/A N/A 1874 G /usr/bin/gnome-shell 87MiB |
| 0 N/A N/A 2331 G ...AAAAAAAAA= --shared-files 51MiB |
| 0 N/A N/A 4307 G /usr/lib/firefox/firefox 175MiB |
| 0 N/A N/A 4569 G /usr/lib/firefox/firefox 37MiB |
| 0 N/A N/A 21370 G ...AAAAAAAAA= --shared-files 33MiB |
| 0 N/A N/A 24668 G ...AAAAAAAAA= --shared-files 56MiB |
| 0 N/A N/A 25867 C python 1349MiB |
+-----------------------------------------------------------------------------+
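To put a number on the gap, here is the plain arithmetic using only the figures quoted above (nvidia-smi reports MiB; no PyTorch calls involved):

```python
# Values copied from the outputs above.
reserved_gib = 0.007812   # torch.cuda.memory_reserved for device 0
smi_mib = 1349            # nvidia-smi usage for the python process

smi_gib = smi_mib / 1024  # MiB -> GiB
gap_gib = smi_gib - reserved_gib
print("nvidia-smi: %.3fGiB, allocator-reserved: %.3fGiB, untracked: %.3fGiB"
      % (smi_gib, reserved_gib, gap_gib))
# -> nvidia-smi: 1.317GiB, allocator-reserved: 0.008GiB, untracked: 1.310GiB
```

So roughly 1.3 GiB of the process's footprint does not show up in any of the allocator counters I printed, which is exactly the discrepancy I am asking about.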