PyTorch only using one GPU?

Kalamaya · January 29, 2017, 8:12pm

From nvidia-smi, I can see that during training my PyTorch script is only using one GPU. Is there a way to make it use the other one as well? (I have two.) Thanks.

apaszke · January 29, 2017, 8:32pm

Sure, but keep in mind that the way to use multiple GPUs depends on the application. You might be interested in the DataParallel module, or you can spread the model over multiple GPUs by passing an additional argument to the .cuda() call: .cuda(1) will place the tensor/module on the 2nd GPU (device indices start at 0).
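
A minimal sketch of the explicit-placement option, assuming a machine that may or may not have a second GPU. The helper name place_on_gpu is hypothetical and only exists for this illustration; it falls back to the CPU when the requested device is missing so the snippet still runs anywhere.

```python
import torch
import torch.nn as nn


def place_on_gpu(module, index):
    """Move `module` to GPU `index` if that device exists; otherwise leave it on the CPU.

    Hypothetical helper for illustration only, not part of PyTorch.
    """
    if torch.cuda.is_available() and index < torch.cuda.device_count():
        # .cuda(1) targets the second GPU because indices are zero-based.
        return module.cuda(index)
    return module


layer = nn.Linear(4, 2)
layer = place_on_gpu(layer, 1)

# Inspect where the parameters actually ended up.
device = next(layer.parameters()).device
print(device)
```

With this approach you decide by hand which module lives on which device, and you also have to move the intermediate tensors between devices yourself.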

Kalamaya · February 1, 2017, 2:42am

Hi @apaszke, thanks, but I have a few questions. I currently have a class where I define my net:

class DNN(nn.Module):

    def __init__(self):
        super(DNN, self).__init__()
        # stuff

    def forward(self, input):
        ...

Now in my main script, I call:

myDNN = DNN()
myDNN.cuda()

I took a look at the documentation for DataParallel, and what I currently do is this:

myDNN = torch.nn.DataParallel(DNN(), device_ids=[0, 1, 2])

Now this seems to run, but it complains that it doesn't know what “forward” is (that is, the forward member function I defined in my class).

So I am confused as to what exactly I should be putting through this “DataParallel”… every member function of my DNN?

Thanks

apaszke · February 1, 2017, 11:07am

No, you only wrap a single module in DataParallel and it will parallelize the whole subtree. Not sure what the error is, but it’s likely a bug in your model. I can’t help much without the message and stack trace.
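
A minimal sketch of what wrapping a single module looks like, assuming a small stand-in network (TinyNet and its layer sizes are made up for illustration and are not the DNN from the post above). When no GPU is visible, DataParallel simply calls the wrapped module, so the snippet also runs on a CPU-only machine.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for the poster's DNN class."""

    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(8, 16)
        self.out = nn.Linear(16, 3)

    def forward(self, x):
        # forward must be defined at class level, indented like __init__.
        return self.out(torch.relu(self.hidden(x)))


model = TinyNet()
if torch.cuda.is_available():
    # The parameters need to live on the first device DataParallel uses.
    model = model.cuda()

# Only the top-level module is wrapped; its submodules come along with it.
# device_ids is left at its default, which is every visible GPU.
parallel_model = nn.DataParallel(model)

batch = torch.randn(6, 8)
output = parallel_model(batch)  # the batch is split along dimension 0
print(output.shape)
```

The original network stays reachable as parallel_model.module, which is handy for calling it directly to check whether an error comes from the model itself rather than from DataParallel.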