Programmatically check if PyTorch is using a GPU?

crossjbeer · May 7, 2021, 7:20pm

Hi All,

Apologies if this is a repeat question. I have searched around for some time and I am not able to find any resources.

My Situation:
I am attempting to create a cron job which will automatically kick off a neural network building pipeline. I have a workstation with 2 GPUs installed. I would like a Python script to detect whether either GPU is in use, and begin building the model on one that is not.

What I have found so far:
I have done a bit of searching, and keep coming back to mentions of the ‘torch.cuda.is_available()’ function, as in this link [python - How to check if pytorch is using the GPU? - Stack Overflow]. However, as is also mentioned in that link, this function doesn’t tell us whether or not a GPU is currently building a model. Rather, it simply tells us whether PyTorch can see a usable GPU at all.

My Question:
Is there a way to programmatically check whether or not a PyTorch model is building on a GPU (or anything analogous) from within a Python script?

Thanks a lot and have a nice day!

eqy · May 7, 2021, 7:36pm

Simply checking whether a GPU is “used” might be dangerous, as the check can race with something else that is contending for the same GPU. However, if you are confident about the scheduling of jobs, you can try something like nvidia-smi --query-compute-apps=pid,process_name,used_memory,gpu_bus_id --format=csv.
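
In case it helps, here is a rough sketch of how that output could be parsed from Python. It assumes nvidia-smi is on the PATH, and it adds a second query (--query-gpu=index,pci.bus_id) to map bus IDs back to GPU indices. The function names are made up for illustration, and the bus ID is read from the last column because a process name could in principle contain a comma.

```python
import csv
import io
import subprocess


def run_nvidia_smi(query_flag):
    """Run nvidia-smi with one --query-* flag and return its CSV output."""
    result = subprocess.run(
        ["nvidia-smi", query_flag, "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_rows(csv_text):
    """Split CSV text into rows of whitespace-stripped fields."""
    reader = csv.reader(io.StringIO(csv_text))
    return [[field.strip() for field in row] for row in reader if row]


def idle_gpu_indices(gpus_csv, apps_csv):
    """Return the indices of GPUs that have no compute process attached.

    gpus_csv: output of --query-gpu=index,pci.bus_id
    apps_csv: output of
        --query-compute-apps=pid,process_name,used_memory,gpu_bus_id
    """
    # The bus ID is the last requested field, so read it from the end.
    busy_bus_ids = {row[-1] for row in parse_rows(apps_csv)}
    return [
        int(row[0])
        for row in parse_rows(gpus_csv)
        if row[-1] not in busy_bus_ids
    ]


def find_idle_gpus():
    """Query the driver twice and report which GPUs look idle right now."""
    gpus_csv = run_nvidia_smi("--query-gpu=index,pci.bus_id")
    apps_csv = run_nvidia_smi(
        "--query-compute-apps=pid,process_name,used_memory,gpu_bus_id"
    )
    return idle_gpu_indices(gpus_csv, apps_csv)
```

One way to use the result is to set CUDA_VISIBLE_DEVICES to the chosen index before importing torch. The nvidia-smi index follows PCI bus order, so setting CUDA_DEVICE_ORDER=PCI_BUS_ID as well should keep the numbering consistent on the CUDA side.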

crossjbeer · May 7, 2021, 8:47pm

Thank you for your response, eqy. I appreciate you taking the time.

I think this command should be by far the easiest nvidia-smi output to parse in a Python script. Would you mind elaborating on what you mean when you say “it might be a race with something else that is contending for a GPU”?

Have a nice day

eqy · May 7, 2021, 8:58pm

For example, if the intent is to avoid stomping over a user’s job that is running on the same system, this method is dangerous because a user can start a job just after nvidia-smi polls the GPUs. So a GPU might “look” idle when it has in fact just been claimed by another job.
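
If every job on the machine is launched by scripts you control, one possible way to narrow that window is to have each job take a lock file per GPU before it starts. The sketch below is only an illustration: the gpu{index}.lock naming and the claim_gpu helper are made up, it relies on fcntl so it is Unix-only, and it does nothing about jobs that don't take the same locks.

```python
import fcntl
import os
import tempfile


def claim_gpu(candidate_indices, lock_dir=tempfile.gettempdir()):
    """Try to take an exclusive lock for one of the candidate GPUs.

    Returns (index, lock_file) on success, or None if every candidate
    is already claimed. Keep lock_file open while the job runs; the
    lock is released when the file is closed or the process exits.
    """
    for index in candidate_indices:
        # Hypothetical naming scheme: one lock file per GPU index.
        path = os.path.join(lock_dir, f"gpu{index}.lock")
        lock_file = open(path, "w")
        try:
            # LOCK_NB makes this fail immediately instead of waiting.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            # Another cooperating job holds this GPU; try the next one.
            lock_file.close()
            continue
        return index, lock_file
    return None
```

The candidate list would be whatever the nvidia-smi check reports as idle, so the lock only decides between cooperating jobs that saw the same idle GPU at the same moment.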

crossjbeer · May 10, 2021, 4:07pm

This is a great point. Luckily, I am the only one who uses the system. I believe, unless I do anything manually, the scheduling script will be the only thing making use of the GPU at any given time.

Your command is parsing well, and it seems you have solved my problem! Thanks a lot.