PyTorch with low GPU utilization


I am training an RNN model on a GPU, but I find that GPU utilization is as low as 20%. I browsed some earlier posts, some of which point out that the DataLoader may be a bottleneck. However, I preload my whole dataset onto the GPU, so I guess that's not the problem, is it?
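For context, this is roughly how I preload the data. The tensor shapes and names here are just placeholders standing in for my real dataset:

```python
import torch

# Use CUDA when available (my real runs are on GPU; CPU fallback for illustration)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Placeholder tensors standing in for my real dataset
inputs = torch.randn(10000, 50, 32)   # (samples, seq_len, features)
targets = torch.randn(10000, 1)

# Move the entire dataset to the device once, up front,
# so no host-to-device copies happen inside the training loop
inputs = inputs.to(device)
targets = targets.to(device)
```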

I am doing hyperparameter tuning: I generate 50 models with different optimizers, and during each epoch I train all 50 models one after another. I use a dictionary to store references to the models and optimizers. Is there a bottleneck in any of these steps? I notice my CPU usage is always at 100%, so there might be a bottleneck on the CPU side, but I really couldn't find anything.
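A simplified sketch of what my tuning loop looks like. The model architecture, hyperparameter ranges, and batch here are placeholders, not my actual configuration:

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Build 50 model/optimizer pairs, keyed by configuration id
models = {}
for i in range(50):
    model = nn.RNN(input_size=32, hidden_size=64, batch_first=True).to(device)
    lr = 10 ** torch.empty(1).uniform_(-4, -2).item()  # random learning rate
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    models[i] = (model, optimizer)

# Each "epoch" trains every model in turn on the (preloaded) data
x = torch.randn(16, 50, 32, device=device)  # placeholder batch
y = torch.randn(16, 50, 64, device=device)
for i, (model, optimizer) in models.items():
    optimizer.zero_grad()
    out, _ = model(x)
    loss = nn.functional.mse_loss(out, y)
    loss.backward()
    optimizer.step()
```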

Could anyone help me with this problem, or suggest ways to debug it?

Many thanks!

By the way, I generate a random number in the forward step of the network, which I guess may be using CPU power. Is there a way to generate a random number directly on the GPU?
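To make the question concrete, this is the difference I mean (simplified; the shapes are placeholders). I'm not sure whether passing `device=` to the random call actually keeps the sampling on the GPU:

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# What I do now: sample on the CPU, then use the value on the GPU side
noise_cpu = torch.rand(1)

# What I'm asking about: sampling directly on the target device,
# e.g. by passing device= so the RNG runs there
noise_gpu = torch.rand(1, device=device)
```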