[Help needed] Training RNN models on GPU with high CPU usage

I am training some models on a GPU; however, it seems that the current (runtime) performance is limited by the CPUs. I see something like %Cpu(s): 29.3 us, 54.2 sy, ... in top. Are there any common reasons for this?

Any thoughts on this would be appreciated.

What’s your current hardware configuration? [CPU, GPU]

16-core E5 CPUs with a GTX 1080 Ti GPU; GPU utilization is around 30%.

Do you mind providing some more information about your model and task?

I think it may be hard to diagnose the issue without that. My initial thought is that some per-batch processing task is maxing out one of the CPU threads and bottlenecking the model, but I can't say without knowing the model, the task, and how they are implemented. Perhaps someone else has other thoughts?
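In the meantime, one quick way to see where the time goes is to profile a few training steps with torch.autograd.profiler. This is only a minimal sketch; the dummy model and loader below are placeholders for your own objects:

```python
import torch
import torch.nn as nn
import torch.autograd.profiler as profiler
from torch.utils.data import DataLoader, TensorDataset

# Dummy stand-ins just to make the snippet runnable; substitute your own
# model and DataLoader when profiling for real.
model = nn.LSTM(8, 16, batch_first=True).cuda()
loader = DataLoader(TensorDataset(torch.randn(256, 5, 8)), batch_size=32)

# Profile a handful of steps: the table shows how much time each op spends
# on the CPU vs. on CUDA, which hints at whether host-side work (data prep,
# packing, small ops) or the GPU kernels dominate.
with profiler.profile(use_cuda=True) as prof:
    for step, (x,) in enumerate(loader):
        out, _ = model(x.cuda(non_blocking=True))
        if step >= 10:
            break

print(prof.key_averages().table(sort_by="cpu_time_total"))
```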

I am training an RNN model. I have a custom DataLoader that takes a string and returns the padded sequence together with its original length (e.g., “abc” => [1, 2, 3, 0, 0], 3 when padding to length 5). These data are then fed into an LSTM model with sequence packing.
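Roughly, the setup looks like the sketch below (simplified; StringDataset, VOCAB, MAX_LEN, and LSTMModel are placeholder names rather than the actual code):

```python
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
from torch.nn.utils.rnn import pack_padded_sequence

# Toy character vocabulary; id 0 is reserved for padding.
VOCAB = {ch: i + 1 for i, ch in enumerate("abcdefghijklmnopqrstuvwxyz")}
MAX_LEN = 5

class StringDataset(Dataset):
    """Maps a string to a padded id sequence plus its original length."""
    def __init__(self, strings):
        self.strings = strings

    def __len__(self):
        return len(self.strings)

    def __getitem__(self, idx):
        ids = [VOCAB[c] for c in self.strings[idx]][:MAX_LEN]
        length = len(ids)
        ids = ids + [0] * (MAX_LEN - length)   # pad with 0 up to MAX_LEN
        return torch.tensor(ids), torch.tensor(length)

class LSTMModel(nn.Module):
    def __init__(self, vocab_size, embed_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

    def forward(self, x, lengths):
        emb = self.embed(x)
        # Pack so the LSTM skips the padded time steps.
        packed = pack_padded_sequence(emb, lengths.cpu(), batch_first=True,
                                      enforce_sorted=False)
        _, (h_n, _) = self.lstm(packed)
        return h_n[-1]

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = LSTMModel(vocab_size=len(VOCAB) + 1).to(device)
loader = DataLoader(StringDataset(["abc", "ab", "abcde"]), batch_size=2)

for ids, lengths in loader:
    out = model(ids.to(device), lengths)
```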

Also, if it were a CPU-bound job, shouldn’t the CPU utilization be around 100% user time with only a small sys component?

Hi Julian,
Do you have any further thoughts on this?
Thanks!