Efficiency of dataloader and collate for large array-like datasets

I find this part slightly confusing, as there are several options.

My concerns are in this new thread:

The issue, as I see it, is that the documentation actually describes several ways to set up "batch sampling". These include writing a custom sampler that yields whole batches of indices, passing the pre-defined BatchSampler class as that custom sampler (in both cases with batch_size=None), and finally just passing it to the dedicated batch_sampler argument.
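For reference, here is a minimal sketch of the three configurations I mean. The dataset class, sizes, and index generator are placeholders I made up for illustration:

```python
import torch
from torch.utils.data import BatchSampler, DataLoader, SequentialSampler


# Hypothetical map-style dataset, only for illustration.
class MyDataset(torch.utils.data.Dataset):
    def __init__(self, n=1000):
        self.data = torch.randn(n, 16)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        # With batch_size=None and a batch-yielding sampler, idx arrives as a
        # whole list of indices; with automatic batching it is a single int.
        return self.data[idx]


dataset = MyDataset()


# (1) Custom sampler that yields lists of indices, with automatic batching
#     disabled (batch_size=None): dataset[list_of_indices] is called directly.
def chunked_indices(n, batch_size):
    for start in range(0, n, batch_size):
        yield list(range(start, min(start + batch_size, n)))


loader1 = DataLoader(dataset, sampler=chunked_indices(len(dataset), 64),
                     batch_size=None)

# (2) The pre-defined BatchSampler used the same way, also with batch_size=None.
batch_sampler = BatchSampler(SequentialSampler(dataset),
                             batch_size=64, drop_last=False)
loader2 = DataLoader(dataset, sampler=batch_sampler, batch_size=None)

# (3) The same BatchSampler passed to the dedicated batch_sampler argument:
#     here the DataLoader fetches one index at a time and collates the samples.
loader3 = DataLoader(dataset, batch_sampler=batch_sampler)
```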

However, just setting batch_size and drop_last, which the docs describe as equivalent, clearly does not pass a list of indices to the dataset; instead it queries single indices, which are then collated.
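A quick way to see this is to print the type of index that __getitem__ receives; a toy example (the dataset contents are made up):

```python
import torch
from torch.utils.data import DataLoader, Dataset


class VerboseDataset(Dataset):
    def __init__(self, n=16):
        self.data = torch.arange(n, dtype=torch.float32)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        print(type(idx), idx)  # with batch_size=4 this prints <class 'int'> 0, 1, 2, 3, ...
        return self.data[idx]


# Automatic batching: one __getitem__ call per single index, then collation.
loader = DataLoader(VerboseDataset(), batch_size=4, drop_last=True)
next(iter(loader))
```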

I plan to try all these options on my problem today or tomorrow, and will report back in detail on which produces the fastest performance when reading small, medium, and large batches from an HDF5 database.

In particular, I am unsure which is better: using a custom batch sampler with a map-style dataset, or using an iterable-style dataset, where the iteration itself yields batches of a custom size.
This is what I want to test, unless someone already knows the answer; a rough sketch of the two variants I have in mind is below.
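This sketch only shows the structure I intend to benchmark; the file path, the HDF5 dataset name "features", the batch size, and single-process loading are all assumptions on my part:

```python
import h5py
import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset, IterableDataset

H5_PATH = "data.h5"   # placeholder path; assumes one HDF5 dataset named "features"
BATCH_SIZE = 256      # placeholder batch size


# Variant A: map-style dataset whose __getitem__ accepts a whole list of
# indices, driven by a sampler that yields index lists (batch_size=None).
class H5MapDataset(Dataset):
    def __init__(self, path):
        self.file = h5py.File(path, "r")
        self.features = self.file["features"]

    def __len__(self):
        return len(self.features)

    def __getitem__(self, indices):
        # h5py fancy indexing needs increasing indices; one read per batch.
        return torch.from_numpy(self.features[np.sort(indices)])


def batched_indices(n, batch_size):
    order = np.random.permutation(n)
    for start in range(0, n, batch_size):
        yield order[start:start + batch_size].tolist()


map_ds = H5MapDataset(H5_PATH)
loader_a = DataLoader(map_ds,
                      sampler=batched_indices(len(map_ds), BATCH_SIZE),
                      batch_size=None)


# Variant B: iterable-style dataset that slices contiguous chunks itself.
# (With num_workers > 0 each worker would iterate the whole file; that case
# is left out of this sketch.)
class H5IterableDataset(IterableDataset):
    def __init__(self, path, batch_size):
        self.path = path
        self.batch_size = batch_size

    def __iter__(self):
        with h5py.File(self.path, "r") as f:
            features = f["features"]
            for start in range(0, len(features), self.batch_size):
                yield torch.from_numpy(features[start:start + self.batch_size])


loader_b = DataLoader(H5IterableDataset(H5_PATH, BATCH_SIZE), batch_size=None)
```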