How to change the default device of GPU? device_ids[0]


(Lynnea718) #1

How to change the default device of GPU?
For some reason, I cannot use device_ids[0] of the GPU.
I changed the following code (in data_parallel.py):

if output_device is None:
    output_device = device_ids[0]

to

if output_device is None:
    output_device = device_ids[1]

but it still seems to use device_ids[0], and I get the error:
all tensors must be on devices[0]
How can I change it?


(Francisco Massa) #2

I think the best way is to directly specify with the CUDA_VISIBLE_DEVICES environment variable.
Something like

CUDA_VISIBLE_DEVICES=1,2 python myscript.py

so that your script only sees GPUs 1 and 2 and won't touch the other GPUs.
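
The same launch can be sketched from Python with the standard library, which makes it easy to check what the child process actually receives. This is only an illustration: `run_with_visible_gpus` is a hypothetical helper, and the one-line probe stands in for `myscript.py`.

```python
import os
import subprocess
import sys


def run_with_visible_gpus(gpu_ids, python_args):
    """Run a Python child process that can only see the given GPU ids."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return subprocess.run(
        [sys.executable, *python_args],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )


# Stand-in for `myscript.py`: the child only reports the variable it was given.
probe = "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"
result = run_with_visible_gpus([1, 2], ["-c", probe])
print(result.stdout.strip())
```

The parent process's own environment is left unchanged; only the child is restricted.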


(Lynnea718) #3

I tried adding
CUDA_VISIBLE_DEVICES=1,2
in main.py,
but it does not work.

We have 8 GPUs. I use devices [1, 2]; the rest are used by others.


(colesbury) #4

CUDA_VISIBLE_DEVICES is an environment variable. You set it on the command line, not in the Python script.


#6

So how do I choose the GPU devices from within the Python script? In particular, I want to switch GPU devices during training.


(Uzeful) #7

I also want to know how to choose the GPU device in the python script.


(Jing) #8

torch.cuda keeps track of the currently selected GPU, and all CUDA tensors you allocate will be created on it. The selected device can be changed with the torch.cuda.device context manager.

Example:

import torch

with torch.cuda.device(1):
    w = torch.FloatTensor(2, 3).cuda()
    # w is placed on device 1, the current device inside this block

Or you can pass the GPU id to .cuda() directly:

w = torch.FloatTensor(2, 3).cuda(2)
# w is placed on device 2

See more on: http://pytorch.org/docs/master/notes/cuda.html
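
For the nn.DataParallel case in the original question, editing data_parallel.py should not be necessary, because nn.DataParallel accepts `device_ids` and `output_device` arguments. The wrapped module's parameters also have to live on `device_ids[0]`, which is probably what the "all tensors must be on devices[0]" message refers to. A minimal sketch, assuming a machine with at least three GPUs (it falls back to plain CPU execution otherwise); the Linear model and tensor sizes are placeholders, not taken from this thread:

```python
import torch
import torch.nn as nn

device_ids = [1, 2]       # GPUs to use (assumed to exist on the machine)
model = nn.Linear(10, 2)  # placeholder model
batch = torch.randn(8, 10)

if torch.cuda.device_count() > max(device_ids):
    # Parameters must be on device_ids[0] before the forward pass.
    model = model.cuda(device_ids[0])
    model = nn.DataParallel(
        model, device_ids=device_ids, output_device=device_ids[0]
    )
    batch = batch.cuda(device_ids[0])

# On a machine without enough GPUs this simply runs on the CPU.
out = model(batch)
print(out.device, tuple(out.shape))
```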


(Uzeful) #9

Thanks for your reply, I will run some experiments to verify these functions. Besides, I found some other useful functions in the thread "How to specify GPU usage?".


(jdhao) #10

I tried setting CUDA_VISIBLE_DEVICES in the shell and then ran a simple script to test whether the setting had taken effect. Unfortunately, it does not seem to work.

CUDA_VISIBLE_DEVICES=3; python test.py

The script test.py is:

import torch
print(torch.cuda.current_device())

The above script still shows that the current device is 0.

I find that torch.cuda.set_device(device_num) works fine for selecting the desired GPU.


(Simon Wang) #11

You did two things wrong:

  1. There shouldn't be a semicolon. With the semicolon, the assignment and the python command are two separate shell commands, so the variable is not exported to the python process and Python won't see it.
  2. Even with the correct command CUDA_VISIBLE_DEVICES=3 python test.py, you won't see torch.cuda.current_device() = 3, because the variable completely changes which devices PyTorch can see. In PyTorch land, device #0 is then actually device #3 of your system. You can verify that with nvidia-smi.
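
The first point can be checked without a GPU. The sketch below assumes a POSIX shell and uses a one-line probe as a stand-in for test.py; `child_sees` is a hypothetical helper name.

```python
import os
import shlex
import subprocess
import sys

# Start from an environment where the variable is not already exported.
clean_env = {k: v for k, v in os.environ.items() if k != "CUDA_VISIBLE_DEVICES"}

# Stand-in for `python test.py`: print what the child process sees.
probe = "import os; print(os.environ.get('CUDA_VISIBLE_DEVICES'))"
child = shlex.join([sys.executable, "-c", probe])


def child_sees(command_line):
    """Run a shell command line and return what the Python child printed."""
    done = subprocess.run(
        command_line,
        shell=True,
        env=clean_env,
        capture_output=True,
        text=True,
        check=True,
    )
    return done.stdout.strip()


# With the semicolon the assignment is a separate, unexported shell variable.
with_semicolon = child_sees(f"CUDA_VISIBLE_DEVICES=3; {child}")
# Without it, the assignment is placed in the child's environment.
without_semicolon = child_sees(f"CUDA_VISIBLE_DEVICES=3 {child}")
print(with_semicolon, without_semicolon)
```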

(jdhao) #12

Thanks for the information. I thought that PyTorch would print the actual GPU id even when we use CUDA_VISIBLE_DEVICES to restrict the available GPUs.


(Simon Wang) #13

That variable controls which devices CUDA exposes, and PyTorch can't do anything in this regard.
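
To recover the system GPU id from the index PyTorch reports, the renumbering can be undone by hand. A small sketch, assuming CUDA_VISIBLE_DEVICES holds a plain comma-separated list and that CUDA enumerates devices in the same order as nvidia-smi (which may require setting CUDA_DEVICE_ORDER=PCI_BUS_ID); `physical_gpu_id` is a hypothetical helper, not a PyTorch function:

```python
def physical_gpu_id(logical_index, visible_devices):
    """Map a PyTorch-side device index to the id listed in CUDA_VISIBLE_DEVICES.

    `visible_devices` is the value of the variable, for example "1,2".
    Ids are returned as strings, exactly as they were written.
    """
    ids = [part.strip() for part in visible_devices.split(",") if part.strip()]
    return ids[logical_index]


# With CUDA_VISIBLE_DEVICES=3, PyTorch's device 0 is system GPU 3.
print(physical_gpu_id(0, "3"))
# With CUDA_VISIBLE_DEVICES=1,2, PyTorch's device 1 is system GPU 2.
print(physical_gpu_id(1, "1,2"))
```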


(Hengshuang Zhao) #14

Hi, you can specify which GPUs to use inside the Python script as follows:

import os
from argparse import ArgumentParser

parser = ArgumentParser(description='Example')
parser.add_argument('--gpu', type=int, default=[0, 1], nargs='+', help='GPU ids to use')

args = parser.parse_args()
# Set this before any CUDA call (safest: before importing torch).
os.environ["CUDA_VISIBLE_DEVICES"] = ','.join(str(x) for x in args.gpu)
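
Where this assignment sits in the script matters: CUDA reads the variable when it initializes, so setting it afterwards has no effect. The following is a sketch of a guard for that, assuming `torch.cuda.is_initialized()` is available in the installed PyTorch version; `set_visible_gpus` is a hypothetical helper name.

```python
import os
import sys


def set_visible_gpus(gpu_ids):
    """Restrict the GPUs CUDA will expose; must run before CUDA initializes."""
    torch_module = sys.modules.get("torch")  # None if torch is not imported yet
    if torch_module is not None and torch_module.cuda.is_initialized():
        raise RuntimeError(
            "CUDA is already initialized; CUDA_VISIBLE_DEVICES would be ignored"
        )
    os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)


set_visible_gpus([1, 2])  # call this first ...
# import torch            # ... and only then import torch and touch CUDA
print(os.environ["CUDA_VISIBLE_DEVICES"])
```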


(jianzhonghe) #15

It does not work for me; it always uses the first GPU (i.e., 0).


(Kaiyu Yue) #16

Thanks, this is the easiest way to solve this problem.


(Simon Wang) #17

It shouldn’t happen. That is a CUDA flag. Once set, PyTorch will never have access to the excluded device(s).


(jianzhonghe) #18

I changed where that line is placed and it worked. Thanks for your reply!



(Moonlightlane) #20

or torch.cuda.set_device(device_id)
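
A small sketch of that call with a CPU fallback, so it also runs on a machine without the requested GPU. `select_gpu` is a hypothetical helper, and the PyTorch documentation generally prefers CUDA_VISIBLE_DEVICES or the torch.cuda.device context manager over set_device.

```python
import torch


def select_gpu(device_id):
    """Make device_id the current CUDA device if it exists, else use the CPU."""
    if torch.cuda.is_available() and device_id < torch.cuda.device_count():
        torch.cuda.set_device(device_id)
        return torch.device("cuda", torch.cuda.current_device())
    return torch.device("cpu")


device = select_gpu(1)
w = torch.zeros(2, 3, device=device)  # allocated on the selected device
print(w.device)
```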


(Maplewizard) #21

This is a very useful solution, especially when you are going to run someone else's code that does not specify the CUDA device id.