Is there a tutorial on the use of multiple GPUs? So far I have only seen a short tutorial that says to wrap the module in DataParallel, but is that all one needs to do? Just write the model normally and then call
model = DataParallel(model).cuda()?
In the ImageNet example, I have also seen the use of a distributed sampler when loading the training data. Is that something we need to care about?
There’s a little more nuance to it if you want to control exactly which GPUs you parallelize over. If you look at the docs for DataParallel you’ll see that you can specify device_ids. If you do that, you’ll also want to make sure you load all of your variables onto the same GPU to start with (with your_variable.cuda(device_id=ID)). That should be pretty much it.
Yes, basically if you have 3 GPUs but only want to use 2 of them, then you have to specify which ones you want to use. Otherwise you can just call cuda() on the model and the Variables. Do keep in mind that unless you’re running this on a dedicated headless server, one of your GPUs may be tied up displaying your desktop etc., so you might get strange errors.