Hi all,
Is there a CUDA function that automatically chooses one of the free GPUs?
I have multiple GPUs and allocate one GPU per program. Right now, every time I check which GPUs are available using nvidia-smi,
and then pick one with something like os.environ['CUDA_VISIBLE_DEVICES'] = '0'.
This is quite inconvenient, so I would like a function that automatically finds a free GPU index and uses it.
import os
import numpy as np

def get_free_gpu():
    # Dump the per-GPU free-memory lines from nvidia-smi into a temp file,
    # parse the free MiB value for each GPU, and return the index of the
    # GPU with the most free memory.
    os.system('nvidia-smi -q -d Memory | grep -A4 GPU | grep Free > tmp')
    memory_available = [int(x.split()[2]) for x in open('tmp', 'r').readlines()]
    return np.argmax(memory_available)
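If you want to avoid the temporary file, here is a variant that reads the nvidia-smi output through subprocess instead (a sketch; `parse_free_memory` is my own helper name, not part of any library, and it assumes nvidia-smi is on your PATH):

```python
import subprocess
import numpy as np

def parse_free_memory(smi_output):
    # "nvidia-smi --query-gpu=memory.free --format=csv,nounits,noheader"
    # prints one integer (free MiB) per line, one line per GPU.
    return [int(x) for x in smi_output.split()]

def get_free_gpu():
    # Return the index of the GPU with the most free memory.
    smi_output = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.free",
         "--format=csv,nounits,noheader"], text=True)
    return int(np.argmax(parse_free_memory(smi_output)))
```

Splitting the parsing into its own function also lets you tweak the "free" criterion in one place later.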
I guess this will help.
@kelam_goutam Thanks for the reply!
If I am already running a job on gpu:0 but its memory usage is only 100MiB / 10000MiB, would this code still pick gpu:0?
What is the criterion for 'memory_available'?
@J_Na I had the same problem and created a package, available on GitHub, that parses the output of nvidia-smi automatically and returns a list of all unused GPUs. The GPUs are filtered in Python (not by nvidia-smi), so you can redefine "available" to mean whatever criterion you prefer.
You can use this to figure out the GPU id with the most free memory:
nvidia-smi --query-gpu=memory.free --format=csv,nounits,noheader | nl -v 0 | sort -nrk 2 | cut -f 1 | head -n 1 | xargs
So instead of:
python3 train.py
You can use:
CUDA_VISIBLE_DEVICES=$(nvidia-smi --query-gpu=memory.free --format=csv,nounits,noheader | nl -v 0 | sort -nrk 2 | cut -f 1 | head -n 1 | xargs) python3 train.py
You can also add it to your .bashrc; then you can directly run python3 train.py:

alias python3='CUDA_VISIBLE_DEVICES=$(nvidia-smi --query-gpu=memory.free --format=csv,nounits,noheader | nl -v 0 | sort -nrk 2 | cut -f 1 | head -n 1 | xargs) python3'
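The same selection can also be done inside the training script itself, as long as CUDA_VISIBLE_DEVICES is set before torch (or any other CUDA-using library) is imported. A sketch under those assumptions; `pick_freest_gpu` is my own naming, and the injectable `smi_output` argument exists only so the logic can be tested on a machine without a GPU:

```python
import os
import shutil
import subprocess

def pick_freest_gpu(smi_output=None):
    # smi_output lets tests inject canned nvidia-smi output;
    # by default we run nvidia-smi ourselves.
    if smi_output is None:
        smi_output = subprocess.check_output(
            ["nvidia-smi", "--query-gpu=memory.free",
             "--format=csv,nounits,noheader"], text=True)
    free = [int(x) for x in smi_output.split()]
    # Index of the GPU with the most free MiB, as a string for the env var.
    return str(free.index(max(free)))

# Only query when nvidia-smi exists, so this file imports cleanly anywhere.
if shutil.which("nvidia-smi"):
    os.environ["CUDA_VISIBLE_DEVICES"] = pick_freest_gpu()
```

Put this at the very top of train.py, above `import torch`, so CUDA only ever sees the chosen device.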