I have a Java background and am new to Python, so my assumption was that once a process is closed, its memory is released. But I have found out that this is not always true.
My question is:
Why does PyTorch not free the GPU memory after execution has completed?
forPyGCL is my conda env on a shared server, and the table below shows that it is still using 5457MiB of GPU memory.
Because the memory is not freed, I cannot even run my next experiments: it says the memory is full. This problem cost me three weeks before I figured out that the memory is not released after execution.
torch.cuda.empty_cache() is useless in this case.
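For reference, this is roughly the in-process cleanup I tried (a minimal sketch; `release_cuda_memory` is just my own helper name, not a PyTorch API). As far as I understand, `empty_cache()` only returns *cached*, unreferenced blocks to the driver, so it does not help when something else still holds the allocations:

```python
import gc
import torch  # assumes PyTorch is installed

def release_cuda_memory():
    """Best-effort GPU memory release from inside a live process."""
    gc.collect()  # drop unreachable Python objects (and their tensors) first
    if torch.cuda.is_available():
        # empty_cache() frees only cached, unreferenced blocks;
        # tensors still referenced by live Python objects stay allocated.
        torch.cuda.empty_cache()
        print(f"still allocated: {torch.cuda.memory_allocated()} bytes")

release_cuda_memory()
```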
So I had to kill the process, but I am still getting this error:
RuntimeError: CUDA out of memory. Tried to allocate 1.02 GiB (GPU 0; 23.70 GiB total capacity; 5.20 GiB already allocated; 573.56 MiB free; 5.23 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
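To check which processes are actually still holding GPU memory (a sketch, assuming `nvidia-smi` is on the PATH; `<PID>` is a placeholder for a PID from the output):

```shell
# List every process currently holding GPU memory, even if its parent crashed
nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv

# If a stale PID from the finished run is still listed, kill it explicitly:
# kill -9 <PID>
```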
Thu Dec 23 16:51:39 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.27.04 Driver Version: 460.27.04 CUDA Version: 11.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 GeForce RTX 3090 Off | 00000000:02:00.0 Off | N/A |
| 43% 49C P2 140W / 350W | 18322MiB / 24268MiB | 21% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 GeForce RTX 3090 Off | 00000000:03:00.0 Off | N/A |
| 60% 60C P2 321W / 350W | 22951MiB / 24268MiB | 99% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 2 GeForce RTX 3090 Off | 00000000:82:00.0 Off | N/A |
| 61% 61C P2 325W / 350W | 23130MiB / 24268MiB | 98% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 3 GeForce RTX 3090 Off | 00000000:83:00.0 Off | N/A |
| 30% 39C P2 105W / 350W | 19942MiB / 24268MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 1800637 C .../envs/forPyGCL/bin/python 1827MiB |
| 0 N/A N/A 3955522 C python 16493MiB |
| 1 N/A N/A 3931490 C ...drsum-torch1.8/bin/python 22949MiB |
| 2 N/A N/A 1800637 C .../envs/forPyGCL/bin/python 5457MiB |
| 2 N/A N/A 3931491 C ...drsum-torch1.8/bin/python 17671MiB |
| 3 N/A N/A 1800637 C .../envs/forPyGCL/bin/python 1825MiB |
| 3 N/A N/A 1852217 C python 18115MiB |
+-----------------------------------------------------------------------------+