Low GPU utilization while training, should I buy a better CPU?

Samuel_Bachorik · August 11, 2021, 4:38pm

Hi, I bought a new GPU, an RTX 3060 12GB. While training on 480x640 RGB images (1200 images) with batch size 32, I get very low GPU utilization.
As you can see in this image:
CPU: Intel i5 10th gen, 6 cores, 3.8 GHz - utilization 98-100%
RAM: 16 GB at 3200 MHz - 14 GB of 16 GB filled
GPU: 11 of 12 GB memory filled, and utilization only about 5%

My question: is the CPU or the PC RAM limiting my GPU? Is it possible to utilize this GPU more and speed up training? Should I buy a new CPU or more memory?

[Image: nvidia-smi output]

tom · August 11, 2021, 5:59pm

Probably first look at why your data loading is slow by benchmarking which part causes it. Moving preprocessing to the GPU is often crucial, as is using a fast storage medium (i.e. SSD, not HDD). But it’s not optimization until you do your own perf measurements…

Best regards

Thomas

Samuel_Bachorik · August 11, 2021, 6:12pm

Thank you for your help, but can you please give me some specific tips on what to do? My image loader is from the internet, so it should be good. It only uses PIL, numpy and os.

"Look at why your data loading is slow"

How do you know my data loading is slow? Or what do you mean by that?
I am using an M.2 SSD.

tom · August 11, 2021, 6:19pm

To me, data loading here means everything inside the dataset and dataloader: reading from disk, dataset augmentation, etc.
You can time it by running an epoch with just the loop and no model, loss, or backward. Then you can comment out bits and compare, or use time.perf_counter or similar for timing (a profiler would work too, of course).
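A minimal sketch of that kind of loader-only timing, using only the standard library. The names `time_loader_only` and `fake_loader` are hypothetical and not from your code; `fake_loader` just sleeps to stand in for real image loading, so you would pass your own batch iterator instead.

```python
import time


def time_loader_only(loader):
    """Iterate over `loader` once without touching any model and time it.

    Returns (number_of_batches, total_seconds).
    """
    n_batches = 0
    start = time.perf_counter()
    for _batch in loader:
        # Deliberately no forward pass, loss, or backward here:
        # we only want the cost of producing the batches.
        n_batches += 1
    elapsed = time.perf_counter() - start
    return n_batches, elapsed


def fake_loader(num_batches=5, delay=0.01):
    """Hypothetical stand-in for a real data loader (sleeps to mimic I/O)."""
    for i in range(num_batches):
        time.sleep(delay)
        yield i


if __name__ == "__main__":
    n, seconds = time_loader_only(fake_loader())
    print(f"{n} batches in {seconds:.3f} s ({seconds / n:.3f} s per batch)")
```

If the loader-only pass takes nearly as long as a full training epoch, the time is going into producing batches rather than into the GPU work.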

Samuel_Bachorik · August 12, 2021, 6:36am

Hi @tom, I ran a test and this is the result.

Just for info:
When I load the photos in every loop iteration and push them into model training, the training takes 5 hours (tested on 700 images).

Test 1: still slow
I removed the model, backward, and loss from the training loop and left only the image loader (no GPU work). The “training” takes 4 hours.

Test 2: extremely fast
I removed the image loader from the training loop and loaded a batch only once, before the training loop, so the GPU worked with a single batch for the whole training. Training takes only 1 hour and GPU utilization was 100% the whole time.

From this we know that image loading is slowing down the whole training. While the CPU spends so long loading images, the GPU does nothing.

Can you now confirm or refute whether this is a weak CPU or a badly optimized image loader?

If you want to take a look, this is the image loader: PyTorch-AI-road-detection-classification/Model_loader.py at main · Samuel-Bachorik/PyTorch-AI-road-detection-classification · GitHub

tom · August 12, 2021, 8:48pm

I would try to replace the PIL augmentation with one on the GPU, using torchvision or e.g. kornia.
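To give an idea of what augmenting on the GPU can look like, here is a rough sketch that uses plain torch tensor ops rather than torchvision or kornia transforms, so it is a swapped-in illustration and not their API. The function `augment_on_device`, its parameters, and the choice of flip plus brightness shift are assumptions; I don't know which augmentations your loader actually applies.

```python
import torch


def augment_on_device(batch, flip_prob=0.5, max_brightness_shift=0.2):
    """Randomly flip and brightness-shift a float image batch of shape (N, C, H, W).

    Runs on whatever device `batch` already lives on (CPU or GPU).
    """
    n = batch.shape[0]
    # Per-sample decision whether to flip horizontally.
    flip_mask = torch.rand(n, device=batch.device) < flip_prob
    flipped = torch.flip(batch, dims=[3])  # flip along the width axis
    out = torch.where(flip_mask.view(n, 1, 1, 1), flipped, batch)
    # Per-sample additive brightness shift in [-max, +max].
    shift = (torch.rand(n, 1, 1, 1, device=batch.device) * 2 - 1) * max_brightness_shift
    return (out + shift).clamp(0.0, 1.0)


if __name__ == "__main__":
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    # Hypothetical batch: 8 RGB images of 480x640 with values in [0, 1].
    images = torch.rand(8, 3, 480, 640, device=device)
    augmented = augment_on_device(images)
    print(augmented.shape, augmented.device)
```

The point is that the whole batch is augmented in a few tensor operations on the device, instead of image by image in PIL on the CPU.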
Also, you would probably do yourself a favour by using Dataset and DataLoader from PyTorch.
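A minimal sketch of the Dataset/DataLoader pattern, under some assumptions: `RandomImageDataset` and `make_loader` are hypothetical names, and the dataset returns random tensors where a real one would read and decode an image file in `__getitem__`.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class RandomImageDataset(Dataset):
    """Hypothetical stand-in dataset; a real one would load a file per index."""

    def __init__(self, num_images=64, height=48, width=64):
        self.num_images = num_images
        self.height = height
        self.width = width

    def __len__(self):
        return self.num_images

    def __getitem__(self, index):
        # Placeholder for: open image file, decode, convert to a float tensor.
        image = torch.rand(3, self.height, self.width)
        label = index % 2
        return image, label


def make_loader(dataset, batch_size=32, num_workers=0):
    # num_workers > 0 loads batches in background processes so the CPU work
    # can overlap with GPU compute; pin_memory helps host-to-GPU copies.
    return DataLoader(
        dataset,
        batch_size=batch_size,
        shuffle=True,
        num_workers=num_workers,
        pin_memory=torch.cuda.is_available(),
    )


if __name__ == "__main__":
    loader = make_loader(RandomImageDataset(), batch_size=32, num_workers=0)
    for images, labels in loader:
        print(images.shape, labels.shape)
```

The demo uses `num_workers=0` to keep it simple. On a 6-core CPU, trying a few worker processes and timing each setting is probably the first thing to experiment with.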