How to get the same GPU index order as nvidia-smi?

I work on a shared GPU server, and no single user should occupy all the GPUs. So I'm trying to write something that automatically selects the GPU with the least memory used by other users.

I saw that it is possible to get the same GPU usage information as with `nvidia-smi`, either with the [nvgpu](https://pypi.org/project/nvgpu/) package or by using `subprocess` to execute `nvidia-smi` and capture its CSV output (as in the thread "Why the two GPUs on my machine have the same ID, so that Pytorch can only choose one?"). With that, I can tell which GPU has the least memory used.

The only trouble is that PyTorch's device order does not match the one shown by `nvidia-smi` (which I can see from the device names in my case). So I'm looking for a way to map the GPU order reported by `nvidia-smi` to the one used by PyTorch, so that I can correctly select the GPU with the least memory used.

Note: in my case, the server has two "GTX 1080 Ti" GPUs and one "TITAN RTX", so I can tell the TITAN apart by its name, but I cannot distinguish the two "GTX 1080 Ti" from each other.
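To make the `subprocess` approach above concrete, here is a minimal sketch. It assumes `nvidia-smi` supports `--query-gpu=memory.used --format=csv,noheader,nounits` (standard on current drivers); the helper names `parse_memory_used` and `least_used_gpu` are my own, not from any library. Keeping the parsing separate from the call makes it easy to check the logic without a GPU:

```python
import subprocess

def parse_memory_used(smi_output):
    """Parse the output of
    'nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits':
    one integer (MiB) per line, one line per GPU."""
    return [int(line.strip()) for line in smi_output.strip().splitlines()]

def least_used_gpu():
    """Return the nvidia-smi index of the GPU with the least memory used."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.used",
         "--format=csv,noheader,nounits"],
        encoding="utf-8")
    used = parse_memory_used(out)
    return min(range(len(used)), key=used.__getitem__)

# Example with captured output (3 GPUs using 512, 10 and 4000 MiB):
sample = "512\n10\n4000\n"
print(parse_memory_used(sample))  # [512, 10, 4000]
print(min(range(3), key=parse_memory_used(sample).__getitem__))  # 1
```

The index returned here follows `nvidia-smi`'s ordering, which is exactly why the device-order question below matters.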

I found the solution; the problem has already been solved, as presented in the following threads:

  1. Order of CUDA devices

  2. How to setting the GPU No. for training?

  3. Gpu devices: nvidia-smi and cuda.get_device_name() output appear inconsistent

The solution is simply to run the first line of the following code BEFORE calling any `torch.cuda` function, since CUDA initialization is performed only once.

```python
import os
import torch

# Change the order so that it matches "nvidia-smi" ("PCI_BUS_ID") instead of
# the default used by other programs ("FASTEST_FIRST")
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"

# Check that the order is now the same as for "nvidia-smi":
print([torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())])
```
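Putting the two pieces together, the whole flow can be sketched as below. This is an assumption-laden sketch, not a tested recipe: it assumes `nvidia-smi` is on the PATH with the `--query-gpu` flags shown, and the PyTorch part is guarded so the script still illustrates the env-var step on a machine without CUDA:

```python
import os
import subprocess

# Must run before any torch.cuda call, since CUDA is initialized only once:
# make device indices follow PCI bus order, like nvidia-smi.
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"

def pick_least_used_gpu():
    """Return the index (now identical in nvidia-smi and PyTorch)
    of the GPU with the least memory used."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.used",
         "--format=csv,noheader,nounits"],
        encoding="utf-8")
    used = [int(x) for x in out.split()]
    return min(range(len(used)), key=used.__getitem__)

try:
    import torch
    device = torch.device(f"cuda:{pick_least_used_gpu()}")
except (ImportError, FileNotFoundError, subprocess.CalledProcessError):
    device = None  # no PyTorch or no nvidia-smi on this machine
```

Because `CUDA_DEVICE_ORDER` is set before anything touches `torch.cuda`, the index returned by `pick_least_used_gpu()` can be passed straight to `torch.device` without any remapping.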