- Why does `DataLoader` have both a `sampler` and a `batch_sampler` argument, when there is already a sampler called `torch.utils.data.BatchSampler`? Does that mean a `BatchSampler` can only be passed as the `batch_sampler` of a `DataLoader`?
- If I want a `DistributedSampler` that is also batched (`BatchSampler`) and random (`RandomSampler`), I have no way to mix them! Both `DistributedSampler` and `RandomSampler` wrap a dataset/`data_source`, and only `BatchSampler` can wrap one of them.
- Even if they could be mixed, what would be the difference between `RandomSampler(BatchSampler(DistributedSampler(dataset)))` and `BatchSampler(RandomSampler(DistributedSampler(dataset)))`?
- So I think I should use `sampler=DistributedSampler(dataset)` and pass a `batch_size` to the `DataLoader` (or use `sampler=BatchSampler(DistributedSampler(dataset))`). But how can I make it random? `shuffle` is mutually exclusive with `sampler`, and I cannot mix `RandomSampler` with `DistributedSampler`.
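On the first question: `DataLoader` builds a `BatchSampler` internally from whatever index sampler it ends up with, which is why `sampler` (yields single indices) and `batch_sampler` (yields lists of indices) are separate arguments. Below is a minimal pure-Python sketch of what `BatchSampler` does; it is a simplified stand-in for the real class, not its actual implementation, though the real one takes `sampler`, `batch_size`, and `drop_last` in the same way:

```python
class BatchSamplerSketch:
    """Simplified stand-in for torch.utils.data.BatchSampler:
    groups indices from an inner sampler into lists of batch_size."""

    def __init__(self, sampler, batch_size, drop_last):
        self.sampler = sampler
        self.batch_size = batch_size
        self.drop_last = drop_last

    def __iter__(self):
        batch = []
        for idx in self.sampler:
            batch.append(idx)
            if len(batch) == self.batch_size:
                yield batch
                batch = []
        # emit the final, smaller batch unless drop_last is set
        if batch and not self.drop_last:
            yield batch

    def __len__(self):
        if self.drop_last:
            return len(self.sampler) // self.batch_size
        return (len(self.sampler) + self.batch_size - 1) // self.batch_size


# Any iterable of indices works as the inner sampler here.
print(list(BatchSamplerSketch(range(10), batch_size=4, drop_last=False)))
# → [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Because the wrapping already happens inside `DataLoader` when you pass `sampler` plus `batch_size`, passing an already-batched sampler as `sampler` would batch the indices twice.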
Looking at the code, `DistributedSampler(RandomSampler(...))` is valid: the `dataset` argument does not have to be a `Dataset`. It can be another `Sampler`, because only `len()` matters, and any `Sampler` provides that.
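That observation can be demonstrated without torch. The classes below are hypothetical simplified stand-ins (shuffling and seeding omitted for brevity), sketching how a `DistributedSampler`-style wrapper touches its `dataset` argument only through `len()`, so any object with a `__len__`, including another sampler, can be wrapped:

```python
import math


class DistributedSamplerSketch:
    """Simplified stand-in for torch.utils.data.DistributedSampler:
    partitions the index range 0..len(dataset) among replicas."""

    def __init__(self, dataset, num_replicas, rank):
        self.total = len(dataset)  # the only use of `dataset`
        self.num_replicas = num_replicas
        self.rank = rank

    def __iter__(self):
        num_samples = math.ceil(self.total / self.num_replicas)
        indices = list(range(self.total))
        # pad so every replica receives the same number of indices
        indices += indices[: num_samples * self.num_replicas - self.total]
        return iter(indices[self.rank :: self.num_replicas])


class RandomSamplerSketch:
    """Stand-in inner sampler; only its __len__ is ever consulted."""

    def __init__(self, data_source):
        self.n = len(data_source)

    def __len__(self):
        return self.n


data = list(range(10))
inner = RandomSamplerSketch(data)  # a Sampler, not a Dataset
print(list(DistributedSamplerSketch(inner, num_replicas=2, rank=0)))
# → [0, 2, 4, 6, 8]
print(list(DistributedSamplerSketch(inner, num_replicas=2, rank=1)))
# → [1, 3, 5, 7, 9]
```

Note also that the real `DistributedSampler` accepts a `shuffle` flag (default `True`), which is one direct answer to the randomness question above.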