Hi,
I have 10 GPUs available, and 1 of them is in use by another torch process. I would like to run another process on any of the remaining GPUs, but I always get this error message:
RuntimeError: CUDA error: all CUDA-capable devices are busy or unavailable
I tried two options and neither worked:
- Selecting GPUs in the Python script:
import os
import torch
# Restrict the visible GPUs before the first CUDA call
os.environ["CUDA_DEVICE_ORDER"] = 'PCI_BUS_ID'
os.environ["CUDA_VISIBLE_DEVICES"] = '0,1,3,4'
print(f'[INFO] Using GPU: {torch.cuda.current_device()}')
print(f'[INFO] Available GPUs: {torch.cuda.device_count()}')
# List the name of every GPU that torch can see
for d in range(torch.cuda.device_count()):
    print(torch.cuda.get_device_name(d))
But it only recognises 1 GPU:
[INFO] Using GPU: 0
[INFO] Available GPUs: 1
GeForce GTX 1080 Ti
- Selecting GPUs on the command line:
CUDA_VISIBLE_DEVICES=0,1,3,4 python gdxray_cganTrainer.py
and I checked on Linux using
env | grep CUDA_VISIBLE_DEVICES
→ CUDA_VISIBLE_DEVICES=1,3,4
Neither of the two options above works.
I am using torch 1.8.0 and Python 3.8.8.
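For reference, a per-device check along the following lines might help narrow down which GPUs are actually usable. This is only a rough sketch: the helper names `visible_device_ids` and `probe_devices` are made up for illustration, and it assumes that CUDA_VISIBLE_DEVICES is set before the first CUDA call in the process, since the variable is read when CUDA is initialised. The device list `0,1,3,4` is the one from the attempts above.

```python
import os


def visible_device_ids(env=None):
    """Return the entries of CUDA_VISIBLE_DEVICES as a list, or None if unset.

    Hypothetical helper: it only parses the variable, it does not query the driver.
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None
    return [part.strip() for part in raw.split(",") if part.strip()]


def probe_devices():
    """Try a tiny allocation on each visible GPU.

    Returns a dict mapping the logical device index to "ok" or to the error text.
    Returns an empty dict if torch is not installed or no GPU is visible.
    """
    try:
        import torch  # imported late, after the environment variables are set
    except ImportError:
        return {}
    results = {}
    for d in range(torch.cuda.device_count()):
        try:
            torch.zeros(1, device=f"cuda:{d}")  # forces a CUDA context on device d
            results[d] = "ok"
        except RuntimeError as err:
            results[d] = f"failed: {err}"
    return results


if __name__ == "__main__":
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    os.environ["CUDA_VISIBLE_DEVICES"] = "0,1,3,4"
    print("[INFO] Requested GPUs:", visible_device_ids())
    for index, status in probe_devices().items():
        print(f"[INFO] cuda:{index} -> {status}")
```

Note that the indices reported by `probe_devices` are logical ones: with CUDA_VISIBLE_DEVICES=0,1,3,4, `cuda:2` refers to physical GPU 3.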
Any help?