Unable to select GPU

robotechnics · September 26, 2022, 10:50am

Hi,
I have 10 GPUs available, and one of them is in use by another torch process. I would like to run a second process on any of the remaining GPUs, but I always get this error message:
RuntimeError: CUDA error: all CUDA-capable devices are busy or unavailable

I tried several options and none of them worked:

Selecting GPUs in the Python script:
import os
import torch

# Set both variables before the first CUDA call in the process
os.environ["CUDA_DEVICE_ORDER"] = 'PCI_BUS_ID'
os.environ["CUDA_VISIBLE_DEVICES"] = '0,1,3,4'
print(f'[INFO] Using GPU: {torch.cuda.current_device()}')
print(f'[INFO] Available GPUs: {torch.cuda.device_count()}')

# List the name of every device torch can see
for d in range(torch.cuda.device_count()):
    print(torch.cuda.get_device_name(d))

But it only recognises 1 GPU:
[INFO] Using GPU: 0
[INFO] Available GPUs: 1
GeForce GTX 1080 Ti

Selecting GPU on command line:
CUDA_VISIBLE_DEVICES=0,1,3,4 python gdxray_cganTrainer.py

and I checked the variable on Linux using
env | grep CUDA_VISIBLE_DEVICES
→ CUDA_VISIBLE_DEVICES=1,3,4

Neither of the two options above works.
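For reference, this is a minimal sketch of how I understand the index mapping: once CUDA_VISIBLE_DEVICES is set, the process renumbers the listed devices from 0, so physical GPU 3 would show up as cuda:2. The helper name `visible_device_map` is hypothetical (it is not part of torch), and it only parses the variable as a plain comma-separated list; it does not query the driver or check whether a device is busy.

```python
import os


def visible_device_map(env=None):
    """Map logical CUDA index -> physical ID listed in CUDA_VISIBLE_DEVICES.

    Hypothetical helper for illustration only: it parses the variable as a
    comma-separated list and does not talk to the CUDA driver.
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None  # variable unset: no restriction on visible devices
    ids = [tok.strip() for tok in raw.split(",") if tok.strip()]
    # Logical indices are assigned in the order the IDs are listed
    return dict(enumerate(ids))


if __name__ == "__main__":
    # Shows what the current process was actually launched with
    print(visible_device_map())
    # Example with the value used above
    print(visible_device_map({"CUDA_VISIBLE_DEVICES": "0,1,3,4"}))
    # → {0: '0', 1: '1', 2: '3', 3: '4'}
```

Printing this from inside the training script would at least show which value the process actually received, as opposed to what the shell reports.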

I am using torch 1.8.0 and Python 3.8.8.

Any help?