Calls to almost all CUDA functions are causing an out-of-memory error:
In [2]: torch.cuda.is_available()
Out[2]: True
In [3]: torch.cuda.device_count()
Out[3]: 2
In [4]: torch.cuda.device(1)
Out[4]: <torch.cuda.device at 0x7f0024dc7668>
In [5]: torch.cuda.device(0)
Out[5]: <torch.cuda.device at 0x7f0024dc7f98>
nvidia-smi:
Sat Nov 4 20:32:12 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.82                 Driver Version: 375.82                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 0000:01:00.0      On |                  N/A |
| 97%   67C    P2    83W / 198W |   7831MiB /  8105MiB |     94%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 1080    Off  | 0000:02:00.0     Off |                  N/A |
| 34%   56C    P2    49W / 200W |   6172MiB /  8114MiB |     25%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0       702    G   /usr/lib/xorg-server/Xorg                      169MiB |
|    0       942    G   /usr/bin/gnome-shell                           128MiB |
|    0      1743    G   ...el-token=CD354235E476D5C9CE534143076E615F    45MiB |
|    0      6863    C   python                                         287MiB |
|    0     14361    C   python                                        1985MiB |
|    0     14705    G   /usr/bin/nvidia-settings                         0MiB |
|    0     31964    C   python                                        5211MiB |
|    1       702    G   /usr/lib/xorg-server/Xorg                        7MiB |
|    1      6863    C   python                                        6159MiB |
|    1     14705    G   /usr/bin/nvidia-settings                         0MiB |
+-----------------------------------------------------------------------------+
It's weird, given that there are other training sessions running on the two GPUs which are not OOM-ing, and there is enough free memory left!
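For reference, the free memory per GPU implied by the nvidia-smi output above can be worked out as below. This is only a minimal sketch: the numbers are copied by hand from the Memory-Usage column rather than queried from the driver, and `usage_mib` and `free_mib` are hypothetical names used just for this illustration.

```python
# Memory figures (MiB) copied by hand from the nvidia-smi output above.
# Mapping: GPU index -> (used, total). Nothing here queries the driver.
usage_mib = {
    0: (7831, 8105),
    1: (6172, 8114),
}


def free_mib(used, total):
    """Return the memory that is not currently allocated, in MiB."""
    return total - used


# Free memory per GPU, derived purely from the table values.
free = {gpu: free_mib(used, total) for gpu, (used, total) in usage_mib.items()}

for gpu, mib in sorted(free.items()):
    print(f"GPU {gpu}: {mib} MiB free")
```

By these figures GPU 0 has 274 MiB unallocated and GPU 1 has 1942 MiB, so any new process gets far less headroom on GPU 0 than on GPU 1.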
$ uname -r
4.9.48-1-MANJARO
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61