Today, I trained a model on the GPU.
When I checked the GPU, the utilization rate fluctuated between 20% and 60%, and it was not stable.
Could you please give me some advice on possible causes, so I can check them one by one?
Thanks a lot.
One reason might be that your mini-batch size is too small. If you increase the batch size, GPU utilization should increase as well.
Keep in mind that varying the mini-batch size can affect generalization, and that increasing it reduces the number of optimization steps you perform per epoch.
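To make the trade-off concrete, here is a small sketch (plain Python, no framework required) of how the number of optimizer updates per epoch shrinks as the batch size grows; the dataset size of 50,000 is just an assumed example value:

```python
import math

def steps_per_epoch(num_samples, batch_size):
    # Number of optimizer updates in one full pass over the data
    # (the last, possibly partial, batch still counts as one step).
    return math.ceil(num_samples / batch_size)

# Quadrupling the batch size cuts the updates per epoch to a quarter,
# so the learning-rate schedule may need adjusting accordingly.
print(steps_per_epoch(50_000, 32))   # 1563 steps per epoch
print(steps_per_epoch(50_000, 128))  # 391 steps per epoch
```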
Another possible reason is slow preprocessing of the data, which leaves the GPU idle while it waits for the next batch. In that case, you can increase the data loader's number of workers or try to speed up the preprocessing itself.
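The idea behind extra loader workers is to overlap CPU-side preprocessing with GPU compute. The stdlib-only sketch below illustrates that pattern with one background thread and a bounded queue; in a real training loop you would instead set the worker count on your framework's data loader (e.g. PyTorch's `DataLoader(num_workers=...)`), and the `preprocess` function here is a hypothetical stand-in for decoding or augmentation:

```python
import queue
import threading

def preprocess(i):
    # Hypothetical CPU-side preprocessing for sample i (a stand-in for
    # decoding/augmentation that would otherwise stall the GPU).
    return i * 2

def prefetching_loader(indices, buffer_size=4):
    """Yield preprocessed samples while a background thread prepares
    upcoming ones, mimicking what data-loader workers do."""
    q = queue.Queue(maxsize=buffer_size)
    sentinel = object()

    def worker():
        for i in indices:
            q.put(preprocess(i))  # blocks when the buffer is full
        q.put(sentinel)           # signal end of the dataset

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = q.get()
        if item is sentinel:
            break
        yield item  # the consumer (GPU step) runs while the worker prefetches

batches = list(prefetching_loader(range(8)))
print(batches)  # [0, 2, 4, 6, 8, 10, 12, 14]
```

With a fast enough preprocessing pipeline (or enough workers), the consumer never waits on the queue, which is exactly the condition under which GPU utilization stops dipping between batches.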