Low GPU Utilization with Custom DataLoader, OpenCV, and NumPy Preprocessing

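You could profile the data loading with the timing code from the ImageNet example, which tracks the time spent waiting for each batch separately from the total time per iteration, and check whether data loading is really the bottleneck.
If so, take a look at this post, which explains potential bottlenecks in the data loading pipeline.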
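
As a rough illustration of the profiling suggested above, here is a minimal sketch that splits each iteration into "time waiting for the next batch" and "total time". It is not the ImageNet code itself: `profile_loader`, `step_fn`, and `max_batches` are hypothetical names chosen for this example, and it only assumes that your loader is iterable (as a `DataLoader` is).

```python
import time


def profile_loader(loader, step_fn, max_batches=50):
    """Return (avg seconds waiting for a batch, avg seconds per full iteration).

    `loader` is any iterable of batches; `step_fn` is a hypothetical callable
    that runs one training step on a batch.
    """
    data_time = 0.0
    total_time = 0.0
    n = 0
    end = time.perf_counter()
    for batch in loader:
        # Time spent blocked on the loader (file reads, OpenCV/NumPy
        # preprocessing, collation). The first batch also includes
        # iterator start-up.
        fetched = time.perf_counter()
        data_time += fetched - end

        # On a GPU, step_fn should finish with torch.cuda.synchronize();
        # otherwise asynchronous kernels are not included in the timing.
        step_fn(batch)

        now = time.perf_counter()
        total_time += now - end
        end = now
        n += 1
        if n >= max_batches:
            break
    if n == 0:
        raise ValueError("loader produced no batches")
    return data_time / n, total_time / n


if __name__ == "__main__":
    # Stand-in for a slow data pipeline: each batch takes ~20 ms to produce.
    def slow_loader(num_batches=5, delay=0.02):
        for i in range(num_batches):
            time.sleep(delay)
            yield i

    # Stand-in for a fast training step (~5 ms).
    data_avg, iter_avg = profile_loader(slow_loader(), lambda batch: time.sleep(0.005))
    print(f"data: {data_avg:.4f}s  iteration: {iter_avg:.4f}s  "
          f"share: {data_avg / iter_avg:.0%}")
```

If the data share is close to 100% of the iteration time, the GPU is mostly idle waiting for input, and the loading pipeline is the likely bottleneck. If it is small, the low utilization probably comes from somewhere else.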