Training a model on half of the available GPUs

Hello,

I’m currently training my net on a single GPU. However, as I’m starting to use more complex architectures, I’d like to use more than one GPU. I’m running my code on a machine with 4 GPUs, but I can only use 2 of them at a time (I’m sharing resources with one other person).

When I want to use a single GPU, I simply write device = torch.device("cuda:3" if torch.cuda.is_available() else "cpu"). I’m aware of the Data Parallelism tutorial, but it seems to use all available GPUs with no restrictions.
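For reference, this is roughly what my single-GPU setup looks like (MyNet here is just a placeholder for my actual model class):

```python
import torch

# pin everything to GPU 3, falling back to CPU if CUDA is unavailable
device = torch.device("cuda:3" if torch.cuda.is_available() else "cpu")

model = MyNet().to(device)   # MyNet is a placeholder for the real network
batch = batch.to(device)     # inputs go to the same device
```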

So my question is: how do I tell PyTorch to use GPUs 2 and 3? Is there a way to write something like torch.device("cuda:2, cuda:3" …)?

Thanks,
NK

The device_ids argument of DataParallel is there to take care of the devices for you. Otherwise, you can set the environment variable CUDA_VISIBLE_DEVICES=2,3 to make sure that your process never sees the other devices.
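Here is a minimal sketch of the device_ids approach, using a toy nn.Linear model in place of your network:

```python
import torch
import torch.nn as nn

# toy model standing in for your actual network
model = nn.Linear(10, 5)

if torch.cuda.is_available():
    # replicate the model on GPUs 2 and 3 only; outputs are gathered on device_ids[0]
    model = nn.DataParallel(model, device_ids=[2, 3])
    # the module and its inputs live on the first device in device_ids
    model = model.to("cuda:2")

x = torch.randn(8, 10)
if torch.cuda.is_available():
    x = x.to("cuda:2")

out = model(x)  # DataParallel scatters the batch across GPUs 2 and 3
```

With the environment-variable approach instead, you would launch the script as CUDA_VISIBLE_DEVICES=2,3 python train.py; inside the process the two allowed GPUs then show up as cuda:0 and cuda:1.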