I am trying to run multiple processes that share the same GPU, so I need one process to release its GPU memory so that another can use it. From the docs I understand that PyTorch does not return GPU memory to the device right away, and that I need to call torch.cuda.empty_cache(). But nvidia-smi still reports high memory usage even after empty_cache(), while torch.cuda.memory_summary() says current usage is zero. What am I missing?
import torch
from torchvision import models

net = models.alexnet(pretrained=True)
net.cuda()

del net
torch.cuda.empty_cache()  # release cached blocks back to the driver

print(f"{torch.cuda.memory_allocated()} {torch.cuda.max_memory_allocated()}")  # 0 244797440
print(f"{torch.cuda.memory_reserved()} {torch.cuda.max_memory_reserved()}")    # 0 257949696
print(torch.cuda.memory_summary())

import time
time.sleep(100)  # keep the process alive so nvidia-smi can be checked
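Separately, to compare the allocator's view with the driver-level view that nvidia-smi reports, I believe newer PyTorch releases (not my 1.5.1 — this is an assumption about later versions) expose torch.cuda.mem_get_info, a wrapper around cudaMemGetInfo:

```python
import torch

if torch.cuda.is_available():
    # Driver-level numbers: this "used" figure includes the CUDA context,
    # not just memory held by PyTorch's caching allocator.
    free_b, total_b = torch.cuda.mem_get_info(0)
    print(f"driver view: {(total_b - free_b) / 2**20:.0f} MiB used "
          f"of {total_b / 2**20:.0f} MiB")
    # Allocator-level numbers: what memory_summary() is reporting.
    print(f"allocator view: {torch.cuda.memory_allocated(0)} B allocated, "
          f"{torch.cuda.memory_reserved(0)} B reserved")
```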
Output of nvidia-smi:
Thu Feb 10 06:45:21 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.106.00 Driver Version: 460.106.00 CUDA Version: 11.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 GeForce GTX TIT... On | 00000000:AF:00.0 Off | N/A |
| 22% 50C P2 73W / 250W | 476MiB / 12210MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 32247 C python3 471MiB |
+-----------------------------------------------------------------------------+
Output of torch.cuda.memory_summary():
|===========================================================================|
| PyTorch CUDA memory summary, device ID 0 |
|---------------------------------------------------------------------------|
| CUDA OOMs: 0 | cudaMalloc retries: 0 |
|===========================================================================|
| Metric | Cur Usage | Peak Usage | Tot Alloc | Tot Freed |
|---------------------------------------------------------------------------|
| Allocated memory | 0 B | 239060 KB | 239060 KB | 239060 KB |
| from large pool | 0 B | 238928 KB | 238928 KB | 238928 KB |
| from small pool | 0 B | 132 KB | 132 KB | 132 KB |
|---------------------------------------------------------------------------|
| Active memory | 0 B | 239060 KB | 239060 KB | 239060 KB |
| from large pool | 0 B | 238928 KB | 238928 KB | 238928 KB |
| from small pool | 0 B | 132 KB | 132 KB | 132 KB |
|---------------------------------------------------------------------------|
| GPU reserved memory | 0 B | 251904 KB | 251904 KB | 251904 KB |
| from large pool | 0 B | 249856 KB | 249856 KB | 249856 KB |
| from small pool | 0 B | 2048 KB | 2048 KB | 2048 KB |
|---------------------------------------------------------------------------|
| Non-releasable memory | 0 B | 21236 KB | 28613 KB | 28613 KB |
| from large pool | 0 B | 19280 KB | 26528 KB | 26528 KB |
| from small pool | 0 B | 2044 KB | 2085 KB | 2085 KB |
|---------------------------------------------------------------------------|
| Allocations | 0 | 16 | 16 | 16 |
| from large pool | 0 | 7 | 7 | 7 |
| from small pool | 0 | 9 | 9 | 9 |
|---------------------------------------------------------------------------|
| Active allocs | 0 | 16 | 16 | 16 |
| from large pool | 0 | 7 | 7 | 7 |
| from small pool | 0 | 9 | 9 | 9 |
|---------------------------------------------------------------------------|
| GPU reserved segments | 0 | 5 | 5 | 5 |
| from large pool | 0 | 4 | 4 | 4 |
| from small pool | 0 | 1 | 1 | 1 |
|---------------------------------------------------------------------------|
| Non-releasable allocs | 0 | 4 | 4 | 4 |
| from large pool | 0 | 2 | 2 | 2 |
| from small pool | 0 | 2 | 2 | 2 |
|===========================================================================|
I am using torch 1.5.1 and torchvision 0.6.1 on an Ubuntu 18.04 machine.
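In case it is relevant: one workaround I am considering is isolating the GPU work in a short-lived child process, since (as far as I understand) all of a process's GPU memory, including whatever empty_cache() cannot release, is returned to the driver when the process exits. A rough sketch (the worker body is hypothetical, and "spawn" is used because the default "fork" does not mix with CUDA):

```python
import multiprocessing as mp

def gpu_job():
    # Hypothetical workload: torch is imported only inside the child,
    # so all CUDA state lives and dies with this process.
    import torch
    from torchvision import models
    net = models.alexnet(pretrained=True).cuda()
    # ... run inference here ...

def run_in_child(target):
    # The "spawn" start method is required when a child process
    # initializes CUDA.
    ctx = mp.get_context("spawn")
    p = ctx.Process(target=target)
    p.start()
    p.join()
    return p.exitcode

if __name__ == "__main__":
    run_in_child(gpu_job)  # after join(), the child's GPU memory is gone
```

But I would prefer to release the memory without restarting a process, if that is possible.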