Dataloader is more than 4x slower

I see, thanks! That explains why num_workers does not help in my scenario since all the data is already in memory.

From this post, it looks like the DataLoader will not shard the data if you don’t pass a DistributedSampler; each process will iterate over the entire dataset instead :frowning:
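
For anyone landing here later, this is a minimal sketch of the difference as I understand it. The toy dataset, the `world_size` of 2, and the helper `indices_seen` are my own assumptions for illustration. I pass `num_replicas` and `rank` explicitly so it runs in a single process without initializing a process group; in a real DDP job you would normally leave them out and let the sampler read them from the initialized process group.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

# Toy in-memory dataset with 8 samples whose values equal their indices (assumption).
dataset = TensorDataset(torch.arange(8))
world_size = 2  # hypothetical number of processes


def indices_seen(rank, use_sampler):
    """Return the sample values one rank would iterate over in a single epoch."""
    sampler = None
    if use_sampler:
        # shuffle=False keeps the partition deterministic for this illustration.
        sampler = DistributedSampler(
            dataset, num_replicas=world_size, rank=rank, shuffle=False
        )
    loader = DataLoader(dataset, batch_size=4, sampler=sampler)
    return [int(x) for (batch,) in loader for x in batch]


for rank in range(world_size):
    print(rank, "no sampler:", indices_seen(rank, use_sampler=False))
    print(rank, "sampler:   ", indices_seen(rank, use_sampler=True))
```

Without the sampler every rank walks all 8 samples; with it each rank gets its own strided slice. If you turn shuffling on, my understanding is that you also need to call `sampler.set_epoch(epoch)` at the start of each epoch so the ordering changes between epochs.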