I have a total of 4 GPUs and want to use GPU no. 2 for my experiments. At the top of the code I set os.environ["CUDA_VISIBLE_DEVICES"]='2', but I see that I am still using GPU no. 0.
Also, torch.cuda.device_count() returns 4. How can I fix it?
Thanks for the quick reply. But that is not possible for me, as I have another script which tells me which GPUs are free and returns a string, e.g. '2,3'. That string is convenient to pass to os.environ["CUDA_VISIBLE_DEVICES"], but it cannot be used with the other styles you mentioned.
The CUDA_VISIBLE_DEVICES environment variable is read by the CUDA driver, so it needs to be set before the CUDA driver is initialized. It is best to make sure it is set before importing torch (or at least before you do anything CUDA-related in torch).
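A minimal sketch of that ordering, in case it helps. The string '2,3' stands in for whatever your helper script returns, and the try/except around the import is only my assumption so the snippet also runs on a machine without PyTorch installed:

```python
import os

# Stand-in for the string returned by the helper script (hypothetical value).
free_gpus = "2,3"

# Set the mask first, before any module that might initialize CUDA is imported.
os.environ["CUDA_VISIBLE_DEVICES"] = free_gpus

try:
    import torch  # imported only after the mask is in place
except ImportError:
    torch = None  # assumption: lets the sketch run without PyTorch

if torch is not None:
    # Should now report only the masked devices, not all GPUs in the machine.
    print(torch.cuda.device_count())
```

If one of your own modules touches CUDA at import time, it has to be imported after the os.environ line as well.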
The device numbers within your program will always start at 0 and go up, even if you mask to see only devices 2 and 3: from within your program they will be numbered 0 and 1.
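To make the renumbering concrete, here is a small sketch that maps the ids your program sees back to the ids in the mask. The helper logical_to_physical is a hypothetical name of mine, not part of torch, and it assumes the mask holds only comma-separated integer indices:

```python
from typing import Dict


def logical_to_physical(mask: str) -> Dict[int, int]:
    """Map in-program device ids (0, 1, ...) to the ids listed in the mask."""
    # Assumption: the mask contains only integer indices, e.g. "2,3".
    physical_ids = [int(part) for part in mask.split(",") if part.strip()]
    return {logical: physical for logical, physical in enumerate(physical_ids)}


mapping = logical_to_physical("2,3")
print(mapping)  # {0: 2, 1: 3}
```

So with a mask of '2,3', a device string such as "cuda:0" in your program refers to the first entry of the mask, and "cuda:1" to the second.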
Thanks! Point one solved my issue.
I had imported a file in which the CUDA device was getting initialized. I set os.environ["CUDA_VISIBLE_DEVICES"] at the very top, before that import, and it worked as expected.
Hi Pal, I’ve been trying to get this going as well, but unfortunately with limited success.
I cannot get the GPU to run despite the multiple commands I’ve tried.
Can you guide me a bit? You seem to have a better understanding than I do. I am learning by trial and error, and it’s tedious and painful. I’m learning, but not getting to where I’d like to be.