I have a server with 4 Titan XP GPUs. When I use os.environ["CUDA_VISIBLE_DEVICES"] = "0,1" to allocate GPUs for a task in Python, I find that only GPU 0 is used, and I get out-of-memory errors even though GPU 1 is free.
Should I allocate memory to different GPUs myself?


Use CUDA_VISIBLE_DEVICES (not “DEVICE”). You have to set it before you launch the program – you can’t do it from within the program.


My bad, that was a typo in my post. But in my code, when I use
os.environ["CUDA_VISIBLE_DEVICES"] = "1,2"
only GPU 1 is used. At least the line does have an effect in Python: it controls which GPUs may be used.
However, it is supposed to make both GPU 1 and GPU 2 available to the task, yet only GPU 1 is actually used. Even when GPU 1 runs out of memory, GPU 2 stays idle. Is there any other switch that controls parallel computing across two GPUs?
BTW, another question: does PyTorch tend to fill up GPUs one by one, or does it allocate memory evenly across GPUs?

You can push your data to a specific GPU using .cuda(gpu_id). E.g. you can load a generator network on one GPU and the discriminator on the other.
Another option is to use the DataParallel module.
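A minimal sketch of the manual-placement idea, assuming two GPUs are visible (the generator/discriminator modules here are made-up stand-ins, not the original poster's code); it falls back to CPU so it also runs on a machine without CUDA:

```python
import torch
import torch.nn as nn

# Place two sub-networks on different GPUs by hand; fall back to CPU
# when fewer than two GPUs are visible.
use_two_gpus = torch.cuda.device_count() >= 2
dev_g = torch.device("cuda:0" if use_two_gpus else "cpu")
dev_d = torch.device("cuda:1" if use_two_gpus else "cpu")

generator = nn.Linear(16, 16).to(dev_g)      # stand-in "generator"
discriminator = nn.Linear(16, 1).to(dev_d)   # stand-in "discriminator"

noise = torch.randn(4, 16, device=dev_g)
fake = generator(noise)
# activations must be moved between devices explicitly
score = discriminator(fake.to(dev_d))
print(score.shape)  # torch.Size([4, 1])
```

The key point is that with manual placement you also manually move the tensors between devices (the `fake.to(dev_d)` call); nothing is moved for you.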


Do you mean that if I want to use two GPUs at the same time, I have to change my source code and add parallelism-related code?
Or will it just use at most one GPU if nothing is specified?

Basically it is just one line to use DataParallel:

net = torch.nn.DataParallel(model, device_ids=[0, 1, 2])
output = net(input_var)

Just wrap your model with DataParallel and call the returned net on your data.
The device_ids parameter specifies the used GPUs.
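A self-contained sketch of that pattern (the toy model and shapes are made up for illustration); on a CPU-only machine DataParallel simply runs the wrapped module as-is:

```python
import torch
import torch.nn as nn

# Wrap a toy model in DataParallel. With GPUs visible, each forward call
# splits the batch along dim 0 across device_ids; without CUDA it just
# calls the underlying module directly.
model = nn.Linear(10, 2)
n_gpus = torch.cuda.device_count()
net = nn.DataParallel(model, device_ids=list(range(n_gpus)) or None)
if n_gpus:
    net = net.cuda()

x = torch.randn(8, 10)  # batch of 8, split across the visible GPUs
if n_gpus:
    x = x.cuda()
output = net(x)
print(output.shape)  # torch.Size([8, 2])
```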


It doesn't work for me; still, only the first GPU is used. Are there any more tricks?


Does DataParallel only support batch sizes greater than 1? Actually, I only use batch=1.

You need a batch size >= the number of GPUs for DataParallel to help. As its name suggests, data parallelism just pushes the computation for different samples in a batch to different GPUs.
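A small illustration of the splitting, using torch.chunk, which mirrors how DataParallel scatters a batch along dim 0:

```python
import torch

# With batch=1 and 2 GPUs, chunking along dim 0 yields only one chunk,
# so the second GPU receives no work -- hence batch_size >= num_gpus.
batch = torch.randn(1, 10)
chunks = torch.chunk(batch, chunks=2, dim=0)  # pretend 2 GPUs
print(len(chunks))  # 1 -- only one GPU would get data

batch = torch.randn(4, 10)
chunks = torch.chunk(batch, chunks=2, dim=0)
print([c.shape[0] for c in chunks])  # [2, 2] -- both GPUs get work
```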


Hi! Adding os.environ['CUDA_VISIBLE_DEVICES'] = "2" in my code does not work; the code always selects the first GPU. However, CUDA_VISIBLE_DEVICES=2 python on the command line works.
I also find that setting os.environ['CUDA_VISIBLE_DEVICES'] does work in other code. Do you know why?

@MrTuo This is how the PyTorch 0.4.1 convention works: if you set CUDA_VISIBLE_DEVICES=2,3, then for PyTorch physical GPU 2 is cuda:0 and physical GPU 3 is cuda:1. Just check whether your code is consistent with this convention.
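To make that renumbering concrete, a plain-Python sketch (no GPUs needed to run it):

```python
import os

# With CUDA_VISIBLE_DEVICES="2,3", the visible GPUs are renumbered:
# physical GPU 2 becomes cuda:0 and physical GPU 3 becomes cuda:1.
os.environ["CUDA_VISIBLE_DEVICES"] = "2,3"

visible = os.environ["CUDA_VISIBLE_DEVICES"].split(",")
# logical device name -> physical GPU id
mapping = {f"cuda:{i}": f"physical GPU {gpu}" for i, gpu in enumerate(visible)}
print(mapping)  # {'cuda:0': 'physical GPU 2', 'cuda:1': 'physical GPU 3'}
```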


I had the same issue, where running CUDA_VISIBLE_DEVICES=2 python worked but setting os.environ['CUDA_VISIBLE_DEVICES'] = "2" didn't. The cause for me was importing torch before setting os.environ['CUDA_VISIBLE_DEVICES']; moving the assignment to the top of the file, before importing torch, solved it. Hope this helps.


That’s helpful for me, thanks 3000 times

thank you, it works.

Hey, I have the opposite problem: my code uses both of my GPUs by default, no matter what I do. They are different GPU models and I do NOT want to use them for parallel processing. I've tried selecting GPU #0 with CUDA_VISIBLE_DEVICES, tried setting it with torch, and moved it to the beginning of my code; nothing works.

Just for the record, I am doing deep-learning object detection, importing arcgis and torch. Everything else seems to work fine now, until I try to test the learning rate and it tells me my GPUs are imbalanced and that I should exclude GPU #1. I never wanted GPU #1 to be utilized in the first place.

EDIT: Nevermind, it appears to be working now after I did move it towards beginning of code. I guess I reset the kernel somehow, which made it work. I’m just a rookie :stuck_out_tongue:

When I tried this solution (I have two GPUs), it shows an error:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!

I'm not sure which solution you are referring to, but this error could be raised if you manually specify a device inside the model.
Could you post an executable code snippet that reproduces the issue, so that we can debug it, please?
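As a hedged illustration of that failure mode (the module name is made up): if forward() hard-codes a device, e.g. x = x.to("cuda:0"), a DataParallel replica running on cuda:1 will mix devices and raise exactly this error. Keeping the module device-agnostic avoids it:

```python
import torch
import torch.nn as nn

class DeviceAgnostic(nn.Module):
    """No .to()/.cuda() calls inside forward().

    Hard-coding a device in forward() clashes with DataParallel replicas
    on other GPUs, producing "Expected all tensors to be on the same
    device". Let the input's device and the replica's parameters agree.
    """
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        # parameters follow the replica's device; x is moved by the caller
        return self.fc(x)

model = DeviceAgnostic()
out = model(torch.randn(3, 4))
print(out.shape)  # torch.Size([3, 2])
```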

Saved 30 hours of my life. Thanks a ton.

It’s also possible to run into this with bad conda environments.

For me

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "2" # just use one GPU on big machine
import torch
assert torch.cuda.device_count() == 1

failed, but that was because my environment was broken, and even a plain

import torch 

actually raised an error.