Reduce Idleness Between Batch Loads

12 workers supply the GPU with data. During inference, monitoring shows:

  • 100% GPU utilisation
  • all GPU RAM is used

htop shows, more or less, that:

  • none of the CPUs is bottlenecked.

But between batch loads there are moments when the GPU utilisation drops to zero.

I suspect these dips are the main thing holding back overall GPU utilisation.

Is there any way to reduce this idleness?

I have many more avenues for improving performance, but this one seems blindingly obvious, and I am unsure how to address it.

Thanks!

Can you describe the structure of the batch and the size of the tensors?
I assume you have memory pinning enabled and non_blocking=True?
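For reference, here is a minimal sketch of the setup being asked about: pinned host memory on the DataLoader plus asynchronous host-to-device copies. The dataset, batch size, and worker count below are illustrative, not taken from the original post:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Illustrative dataset: 64 fake RGB images of size 32x32
dataset = TensorDataset(torch.randn(64, 3, 32, 32))

loader = DataLoader(
    dataset,
    batch_size=16,
    num_workers=2,     # workers prepare the next batches while the GPU computes
    pin_memory=True,   # page-locked host memory, required for async H2D copies
)

with torch.no_grad():
    for (batch,) in loader:
        # non_blocking=True lets the copy overlap with previously queued GPU
        # work; it only helps when the source tensor is in pinned memory
        batch = batch.to(device, non_blocking=True)
        # ... run the model on batch here ...
```

If the workers can't keep up, raising `num_workers` (and optionally `prefetch_factor`) is usually the first knob to try.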

Hi,

Here are some more details:

Dataloader:

(screenshot of the DataLoader setup)

Collate function:

(screenshot of the collate function)

non_blocking etc.:

(screenshot of the device-transfer code)

img size (actually the stack of images): torch.Size([2010,3,256,256])
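For scale (my own back-of-the-envelope arithmetic, assuming float32, which the thread doesn't state): a stack of that shape is roughly 1.5 GiB, so each host-to-device copy moves a lot of data per batch unless it overlaps with compute.

```python
# Size of the image stack from the post: torch.Size([2010, 3, 256, 256])
elements = 2010 * 3 * 256 * 256
bytes_total = elements * 4          # 4 bytes per float32 element (assumed dtype)
gib = bytes_total / 2**30

print(f"{bytes_total} bytes ≈ {gib:.2f} GiB")

# Rough copy time over PCIe 3.0 x16 at ~12 GB/s effective bandwidth (assumed)
copy_seconds = bytes_total / 12e9
print(f"~{copy_seconds * 1000:.0f} ms per transfer if not overlapped")
```

A transfer of that size taking on the order of 100 ms would be quite visible as a utilisation dip between batches if it isn't overlapped.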

OK, I realised I didn’t read the workers argument correctly! All good now.

:blush:
