DistributedDataParallel and SubsetRandomSampler

I am currently using SubsetRandomSampler to enforce a train-val split on my custom dataset, which works well on my current single-GPU configuration. However, in anticipation of moving to training across multiple nodes and GPUs, I wanted to see if it's possible to "wrap" the splits created by SubsetRandomSampler so that, within my train split, I can replicate the functionality of DistributedSampler.
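For reference, my current single-GPU setup looks roughly like this (the toy dataset, the 80/20 ratio, and the batch size are just placeholders for my real configuration):

```python
import torch
from torch.utils.data import DataLoader, SubsetRandomSampler

# Toy stand-in for my custom dataset; the real one is much larger.
dataset = torch.utils.data.TensorDataset(torch.arange(100).float())

# 80/20 train-val split over shuffled indices.
indices = torch.randperm(len(dataset)).tolist()
split = int(0.8 * len(dataset))
train_idx, val_idx = indices[:split], indices[split:]

train_loader = DataLoader(dataset, batch_size=10,
                          sampler=SubsetRandomSampler(train_idx))
val_loader = DataLoader(dataset, batch_size=10,
                        sampler=SubsetRandomSampler(val_idx))

# Each epoch visits all 80 train samples, each exactly once,
# in a fresh random order.
seen = [int(x) for batch in train_loader for x in batch[0]]
print(len(seen), len(set(seen)))  # 80 80
```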

If not – what alternatives do I have for creating a train-val split? Must I create separate Dataset objects for the train and val sets?
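One direction I've been considering, sketched below, is to wrap the train indices in torch.utils.data.Subset so that the split becomes a Dataset that DistributedSampler can shard. (The num_replicas/rank values are hard-coded here to simulate one of two processes; in a real run they would come from the torch.distributed process group.)

```python
import torch
from torch.utils.data import DataLoader, Subset
from torch.utils.data.distributed import DistributedSampler

# Same placeholder dataset and 80/20 split as in my current setup.
dataset = torch.utils.data.TensorDataset(torch.arange(100).float())
indices = torch.randperm(len(dataset)).tolist()
split = int(0.8 * len(dataset))

# Subset gives a Dataset view over just the train indices, so
# DistributedSampler can shard it like any other dataset.
train_set = Subset(dataset, indices[:split])

# Hard-coded to simulate rank 0 of a 2-process job; normally these
# default to the values from the initialized process group.
sampler = DistributedSampler(train_set, num_replicas=2, rank=0)
loader = DataLoader(train_set, batch_size=10, sampler=sampler)

seen = [int(x) for batch in loader for x in batch[0]]
print(len(seen))  # 40 -- this rank's shard of the 80 train samples
```

I'm not sure whether this interacts correctly with shuffling across epochs (i.e. whether sampler.set_epoch still behaves as expected on a Subset), which is part of what I'm asking.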

cc @vincentqb for the dataloader question. :slight_smile: