A few beginner-level questions to help move from CPU to GPU. I’ve searched previous responses here but couldn’t find specifics.
I have my code up and running on my local GPU (only one device). For any other beginners running across this post: you need to move your tensors (target.cuda()), network (decoder.cuda()), and criterion (criterion.cuda()) to CUDA, and CUDA obviously needs to be available on your system: a physical GPU, drivers, and the NVIDIA/CUDA packages.
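For reference, here is a minimal sketch of that setup in the device-agnostic idiom (`.to(device)` is equivalent to `.cuda()` when a GPU is present). The `decoder`, `target`, and criterion names are illustrative stand-ins — a tiny linear layer in place of the actual RNN:

```python
import torch
import torch.nn as nn

# Pick the GPU if one is available, otherwise fall back to CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

decoder = nn.Linear(8, 3).to(device)           # network: same as decoder.cuda() on a GPU box
criterion = nn.CrossEntropyLoss().to(device)   # criterion on the same device

inputs = torch.randn(4, 8).to(device)          # input tensors moved to the device
target = torch.tensor([0, 1, 2, 0]).to(device) # labels: same as target.cuda()

# Everything (model, inputs, targets) now lives on one device,
# so the forward pass and loss computation run there too.
loss = criterion(decoder(inputs), target)
print(loss.item())
```

The advantage of this pattern over calling `.cuda()` directly is that the same script runs unchanged on a CPU-only machine.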
I want to spin up a small GPU cluster and run my RNN there, but I have a few questions:
- Do RNNs benefit from GPUs?
- Will code that runs properly on my local GPU run out of the box on a GPU cluster? If not, what do I need to be thinking about?
- Do GPUs help if I'm using a batch size of 1? Or are larger batches "good"?
- Do I have to manually allocate/transfer or otherwise keep track of which tensors and other objects go to which device? Or does CUDA/PyTorch figure this out automatically?
- Do I have to gather anything at the end of the computation? (I'm coming from the Spark world, where that's sometimes a thing.)
- For small, simple models (like the one I'm running), CPU and GPU times are very similar. If I take this model to a GPU cluster, will I see any improvement? Is the efficiency gain proportional only to model complexity? Or will the simple model run faster the more nodes my cluster has?
Many questions! Feel free to answer just one.