Is the dataloader parallelizing __getitem__ or batch sampling?

It's the second case: each worker receives the indices for a full batch from the batch sampler, assembles that entire batch itself (calling __getitem__ per sample), and pushes it to the output queue.
Once a batch is ready, the worker starts assembling the next one, until prefetch_factor * num_workers batches are queued.
