Model.to(device) works on cuda:0, but doesn't work on cuda:1, cuda:2, etc.

When I transfer my model to cuda:0:
model = model.to(device=torch.device("cuda:0"))

Everything is fine, but when I try to do the same with torch.device("cuda:1"), I get:

RuntimeError: CUDA error (10): invalid device ordinal

What I tried:

  1. os.environ["CUDA_VISIBLE_DEVICES"] = "1"
  2. export CUDA_VISIBLE_DEVICES=1 and then python my_program.py
  3. CUDA_VISIBLE_DEVICES=1 python my_program.py

However, I'd like to note that print(torch.cuda.current_device()) always gives me 0.
Also, I'm using a conda env. Could conda be the cause of the issue?

If you set CUDA_VISIBLE_DEVICES=1, your Python script can only "see" that one GPU and therefore refers to it as "cuda:0". So the "invalid device ordinal" error is expected: with only one visible device, "cuda:1" does not exist from the process's point of view. If you run your model on cuda:0 with CUDA_VISIBLE_DEVICES set to 1 and then check nvidia-smi, you should see it running on GPU 1.
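A minimal sketch of this remapping (the Linear model is just a hypothetical stand-in). Note that CUDA_VISIBLE_DEVICES must be set before CUDA is initialized, which in practice means before importing torch; setting it mid-script after torch has touched the GPU has no effect:

```python
import os

# Restrict this process to physical GPU 1. Must happen before CUDA
# initialization, i.e. before importing torch.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

import torch

# Inside this process, the one visible GPU is indexed as "cuda:0".
device = torch.device("cuda:0") if torch.cuda.is_available() else torch.device("cpu")

# hypothetical tiny model for illustration
model = torch.nn.Linear(4, 2).to(device)

# With CUDA_VISIBLE_DEVICES=1, torch.cuda.device_count() reports 1 and
# torch.cuda.current_device() reports 0 -- but index 0 now maps to
# physical GPU 1, which is what nvidia-smi will show.
```

This is why torch.cuda.current_device() prints 0 even though the model runs on the second physical GPU: device indices inside the process are always renumbered from 0 over the visible devices.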
