os.environ["CUDA_VISIBLE_DEVICES"] not functioning

I have 4 GPUs in total and want to use GPU no. 2 for my experiments. At the top of my code I set
os.environ["CUDA_VISIBLE_DEVICES"]='2', but I see that I am still using GPU no. 0.

Also, torch.cuda.device_count() returns 4. How can I fix this?

Maybe this helps:

torch.cuda.set_device(2)

as described here.
The currently selected device can be checked using torch.cuda.current_device()

or

device = torch.device('cuda:2')

as described here

And here is an overview of CUDA semantics.

Please keep 0-indexing in mind: cuda:2 is your third CUDA device, not your second.

Thanks for the quick reply. But that is not possible for me, as I have another script which tells me which GPUs are free and returns a string, e.g. '2,3'. That string is convenient to pass to os.environ["CUDA_VISIBLE_DEVICES"], but it cannot be used with the other approaches you mentioned.

Hi,

Two things:

  • The CUDA_VISIBLE_DEVICES environment variable is read by the CUDA driver, so it needs to be set before the CUDA driver is initialized. It is best to make sure it is set before importing torch (or at least before you do anything CUDA-related in torch).
  • The device numbers within your program will always start at 0 and count up. Even if you mask so that only devices 2 and 3 are visible, from within your program they will be numbered 0 and 1.
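
The two points above can be sketched roughly as follows. This is only an illustration: the helper physical_ids is a hypothetical name, the '2,3' string stands in for whatever the GPU-selection script returns, and the parsing assumes the mask holds plain integer indices (not GPU UUIDs).

```python
import os

# Set the mask first, before torch (or anything else that may initialize CUDA) is imported.
# "2,3" stands in for the string returned by a GPU-selection script (assumption).
os.environ["CUDA_VISIBLE_DEVICES"] = "2,3"


def physical_ids(mask):
    """Hypothetical helper: physical GPU ids in the order the program sees them as cuda:0, cuda:1, ..."""
    return [int(part) for part in mask.split(",") if part.strip()]


mapping = physical_ids(os.environ["CUDA_VISIBLE_DEVICES"])
for logical, physical in enumerate(mapping):
    print(f"cuda:{logical} -> physical GPU {physical}")

try:
    import torch  # imported only after the mask is in place
except ImportError:
    torch = None  # torch not installed; the mapping above still illustrates the idea

if torch is not None and torch.cuda.is_available():
    # With the mask applied, only the masked devices should be counted here.
    print("visible devices:", torch.cuda.device_count())
    device = torch.device("cuda:0")  # refers to physical GPU 2 under this mask
```

Inside the program you would then address the devices as cuda:0 and cuda:1, never as cuda:2 or cuda:3.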

Thanks! Point one solved my issue.
I had imported a file in which the CUDA device was being initialized. I moved the os.environ["CUDA_VISIBLE_DEVICES"] assignment to the very top and it worked as expected.
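
For anyone hitting the same problem, a small guard like the one below might help spot an import that initializes CUDA too early. It is only a sketch: set_visible_devices is a hypothetical helper name, and the check only detects initialization done through PyTorch (via torch.cuda.is_initialized()), not by other libraries.

```python
import os
import sys


def set_visible_devices(mask):
    """Hypothetical helper: set CUDA_VISIBLE_DEVICES, refusing if torch already initialized CUDA."""
    torch = sys.modules.get("torch")  # None if nothing has imported torch yet
    if torch is not None and torch.cuda.is_initialized():
        # Too late: the mask would be ignored by the already-initialized CUDA context.
        raise RuntimeError(
            "CUDA is already initialized; set CUDA_VISIBLE_DEVICES earlier in the program"
        )
    os.environ["CUDA_VISIBLE_DEVICES"] = mask


# Call this at the very top of the entry script, before other project imports.
set_visible_devices("2,3")
```

If the RuntimeError is raised, one of the modules imported before this call is touching CUDA, which is the situation described above.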