I'm puzzled: my PyTorch script uses 600% CPU (as shown in `top`) even though I've moved my tensors and model to the GPU. Here's everything I've verified with `.is_cuda`:
- Input sequence tensors
- Input sequence length tensors
- Target tensors
- PackedSequence tensors
- Initialized hidden tensors
- Output tensors
- Output hidden tensors
- Loss
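To make the checks above concrete, here is a minimal sketch of the kind of verification I'm doing (the tensor names and shapes are illustrative, not my actual pipeline):

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence

# Pick the GPU when available, otherwise fall back to CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

seqs = torch.randn(4, 10, 8, device=device)          # padded input sequences
lengths = torch.tensor([10, 9, 7, 5])                # length tensor stays on CPU for packing
targets = torch.randint(0, 5, (4,), device=device)   # target tensors

# PackedSequence inherits the device of the data it packs.
packed = pack_padded_sequence(seqs, lengths, batch_first=True)

print(seqs.is_cuda, targets.is_cuda, packed.data.is_cuda)
```

On a GPU machine all three flags print `True` for me (the length tensor is the one thing `pack_padded_sequence` expects on the CPU).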
Also, my data loaders have `num_workers=1`.
The GPU is definitely being used (confirmed with `nvidia-smi -l 3`).
Am I missing something? Why is so much CPU still being used? I'm wondering whether my pipeline is not working as intended.