Custom shuffle function with DistributedSampler

I have a dataset class that reads samples. Each sample has the attributes (domain, image, label).

In each batch, I want to enforce that every single domain is represented. For example, with 8 domains the batch size would be 8K, with K samples per domain.

I have written a custom shuffle function which sorts the entire pool of images and rearranges the index such that each batch will have K samples from each domain.
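For illustration, here is a minimal sketch of the kind of reordering described above. It is not my actual 'my_custom_shuffler'; the names `domain_balanced_order`, `domains`, `k` and `seed` are hypothetical, and the sketch assumes that leftover samples which cannot fill a complete batch are simply dropped.

```python
import random
from collections import defaultdict

def domain_balanced_order(domains, k, seed=0):
    """Return dataset indices arranged so that every consecutive block of
    k * num_domains indices contains exactly k samples from each domain.

    `domains` is a list giving the domain of each sample, by dataset index.
    """
    rng = random.Random(seed)

    # Group sample indices by domain.
    by_domain = defaultdict(list)
    for idx, d in enumerate(domains):
        by_domain[d].append(idx)

    # Shuffle within each domain so batches differ between resets.
    for pool in by_domain.values():
        rng.shuffle(pool)

    # The smallest domain limits how many complete batches can be built.
    num_batches = min(len(pool) for pool in by_domain.values()) // k

    order = []
    for b in range(num_batches):
        for d in sorted(by_domain):
            order.extend(by_domain[d][b * k:(b + 1) * k])
    return order
```

Calling it with a different `seed` at each reset point would give a new arrangement while keeping every block balanced.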

This sorting is triggered by calling train_loader.dataset.sort('my_custom_shuffler') at every reset point (an epoch boundary or some other logic). To ensure that train_loader does not jumble my list again, shuffle has been set to False for train_loader.

In my application, DistributedSampler(shuffle=False) is defined and assigned to train_loader before the training loop starts. Will my manual sorting affect DistributedSampler (DS)? Does DS assign samples to each rank once, at the very beginning, and maintain that assignment? If yes, then my manual shuffling should not affect how DS operates.
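To make the question concrete, here is a pure-Python sketch of how I understand DistributedSampler to pick indices when shuffle=False and drop_last=False: it yields integer positions only, recomputed each time the sampler is iterated, with rank r taking every num_replicas-th position starting at r. The helper `rank_indices` is hypothetical (it is not part of PyTorch) and it assumes the dataset has at least as many samples as there are ranks.

```python
import math

def rank_indices(dataset_len, num_replicas, rank):
    """Emulate the per-rank index selection for shuffle=False, drop_last=False."""
    indices = list(range(dataset_len))

    # Pad by repeating the first indices so the total divides evenly.
    num_samples = math.ceil(dataset_len / num_replicas)
    total_size = num_samples * num_replicas
    indices += indices[:total_size - len(indices)]

    # Each rank takes a strided slice of the positions.
    return indices[rank:total_size:num_replicas]

# 8 domains, K = 1, two ranks: positions 0..7 and 8..15 are each a balanced block.
domains_in_order = [i % 8 for i in range(16)]
rank0 = rank_indices(16, 2, 0)
print(rank0)  # [0, 2, 4, 6, 8, 10, 12, 14]
print(sorted({domains_in_order[i] for i in rank0}))  # [0, 2, 4, 6]
```

If this picture is right, two things follow. Because only positions are assigned, re-sorting the dataset changes which samples each rank receives. And because of the stride, a block that is balanced in the global order is split across ranks, so a per-rank batch may miss some domains unless the ordering accounts for the number of ranks.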

cc @VitalyFedyunin @glaringlee for DataLoader and DataSampler