From nvidia-smi, I can see that during training my PyTorch script is only using one GPU. Is there a way to make it use the others? (I have two.) Thanks.
Sure, but keep in mind that the way to use multiple GPUs depends on the application. You might be interested in the DataParallel module, or you can spread work over multiple GPUs by passing an additional argument to the
.cuda() call:
.cuda(1) will place the tensor/module on the 2nd GPU.
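A minimal sketch of that second option (the tensor and module here are illustrative, not from the thread): both the input and the module's parameters must live on the same device, and the device-count guard lets the snippet fall back to CPU.

```python
import torch
import torch.nn as nn

x = torch.randn(4, 3)       # a toy input batch
model = nn.Linear(3, 2)     # a toy module

if torch.cuda.device_count() >= 2:
    x = x.cuda(1)           # place the input on the 2nd GPU (device index 1)
    model = model.cuda(1)   # move the module's parameters there as well

y = model(x)
print(y.shape)  # torch.Size([4, 2])
```

Note that an operation between a tensor on GPU 1 and one on GPU 0 (or the CPU) will raise an error, so everything a module touches has to be moved together.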
Hi @apaszke, thanks, but just a few questions. I currently have a class where I define my net, roughly:

class DNN(nn.Module):
    def __init__(self):
        super(DNN, self).__init__()
        # ... layers ...

    def forward(self, input):
        # ... forward pass ...

Now in my main script, I call:

myDNN = DNN()
I took a look at the documentation for DataParallel, and what I currently do is this:

myDNN = torch.nn.DataParallel(DNN(), device_ids=[0, 1, 2])

Now this seems to run, BUT it complains that it doesn't know what "forward" is (the forward member function, that is).
So I am confused as to what exactly I should be putting through this "DataParallel"... every member function of my DNN?
No, you only wrap a single module in
DataParallel and it will parallelize the whole subtree. I'm not sure what the error is, but it's likely a bug in your model. I can't help much without the message and stack trace.
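A self-contained sketch of that pattern (the model name SmallNet and its layers are made up for illustration): you wrap the top-level module once, then call the wrapper itself, which invokes forward() internally and replicates every submodule across the listed devices. On a machine without CUDA, DataParallel simply runs the wrapped module as-is.

```python
import torch
import torch.nn as nn

# Hypothetical small model standing in for the poster's DNN.
class SmallNet(nn.Module):
    def __init__(self):
        super(SmallNet, self).__init__()
        self.fc = nn.Linear(8, 2)

    def forward(self, input):
        return self.fc(input)

model = SmallNet()
if torch.cuda.device_count() > 1:
    # Wrap only the top-level module; its whole subtree is parallelized.
    model = nn.DataParallel(model, device_ids=[0, 1]).cuda()

batch = torch.randn(16, 8)
if torch.cuda.is_available():
    batch = batch.cuda()

# Call the wrapper like the original module; never call model.forward() directly.
out = model(batch)
print(out.shape)  # torch.Size([16, 2])
```

DataParallel splits the batch along dimension 0, runs each chunk on one GPU, and gathers the outputs, so the batch size should be at least the number of devices.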