Understanding DataLoader performance

I figured this out myself. My targets were stored in a SparseTensor, and the recommendations in this thread helped me resolve the problem: Dataloader loads data very slow on sparse tensor - #4 by drj3122.

Essentially, the cost of grabbing a batch from the SparseTensor scaled with the size of that tensor, so more training data meant slower micro-batches.
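
For anyone hitting the same thing, here is a minimal sketch of one possible workaround. It is not necessarily what the linked thread recommends. It assumes the targets fit in memory once densified, and the toy shapes and names (`features`, `sparse_targets`, `dense_targets`) are made up for illustration. The idea is to convert the sparse targets to a dense tensor once, before training, so that each batch fetch is a plain slice rather than a lookup into the sparse structure.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical toy data: 6 samples, 4 features, 5 target columns.
features = torch.arange(24, dtype=torch.float32).reshape(6, 4)

# Sparse COO targets: row 0 has a 1 in column 2, row 1 in column 0, etc.
indices = torch.tensor([[0, 1, 3, 5],   # row of each nonzero
                        [2, 0, 4, 1]])  # column of each nonzero
values = torch.tensor([1.0, 1.0, 1.0, 1.0])
sparse_targets = torch.sparse_coo_tensor(indices, values, size=(6, 5))

# Densify once up front (assumption: the dense tensor fits in memory).
dense_targets = sparse_targets.to_dense()

# Per-batch indexing now hits a dense tensor, not the sparse one.
dataset = TensorDataset(features, dense_targets)
loader = DataLoader(dataset, batch_size=2, shuffle=False)

batches = list(loader)
```

If the dense targets are too large to hold in memory, this will not work as written, and something like densifying only the rows of each batch would be needed instead.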