As far as I can see, the Python DataLoader constructor has a shuffle parameter, but the C++ version does not. How can I shuffle my samples in C++?
Note that by default (at least in Libtorch 1.1) the dataset will be shuffled.
You can change that behaviour by specifying a sampler (from torch::data::samplers::[your sampler of choice]).
To construct a dataloader with a sequential sampler:
auto data_loader = torch::data::make_data_loader<torch::data::samplers::SequentialSampler>(std::move(dataset), batch_size);
Thanks for your reply. I guess using a RandomSampler is just what I need in my case.
When you say the dataset will be shuffled, does that mean only once, when the loader is created?
I had done something similar before (can’t remember why though).
I had written my own loader that inherited from DataLoader and returned an iterator I had also written myself, which in turn inherited from the iterator that DataLoader normally returns.
In that iterator’s __next__() method, I caught the StopIteration exception and then reshuffled the indices for the next pass over the entire dataset.
On top of that, add a shuffle at initialization and you’re good to go.
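Something like this in Python:
from torch.utils.data import DataLoader
# _DataLoaderIter is a private class in older PyTorch releases (around 1.1).
from torch.utils.data.dataloader import _DataLoaderIter

class _DataLoaderIterWrapper(_DataLoaderIter):
    def __next__(self):
        try:
            # Pass the batch through to the caller.
            return super().__next__()
        except StopIteration:
            # DO STUFF (e.g. reshuffle the indices for the next pass)
            raise  # re-raise so the surrounding for loop still terminates

class DataLoaderWrapper(DataLoader):
    def __iter__(self):
        return _DataLoaderIterWrapper(self)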
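For the C++ side of the question, the same idea (shuffle once at initialization, then reshuffle whenever a pass over the dataset ends) can be sketched with the standard library alone. This is only a minimal illustration: ReshufflingIndexSampler is a hypothetical name, not a libtorch class, and it hands out plain index batches rather than plugging into torch::data::make_data_loader.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper (not part of libtorch): yields every index in
// [0, size) exactly once per epoch, in a fresh random order each epoch.
class ReshufflingIndexSampler {
 public:
  ReshufflingIndexSampler(std::size_t size, unsigned seed)
      : indices_(size), rng_(seed) {
    std::iota(indices_.begin(), indices_.end(), std::size_t{0});
    reset();  // shuffle once at initialization
  }

  // Starts a new epoch: reshuffles the indices and rewinds to the front.
  void reset() {
    std::shuffle(indices_.begin(), indices_.end(), rng_);
    position_ = 0;
  }

  // Returns the next batch of at most batch_size indices.
  // An empty vector signals the end of the epoch (the counterpart of
  // StopIteration); the indices are reshuffled before it is returned,
  // so the following call already belongs to the next epoch.
  std::vector<std::size_t> next(std::size_t batch_size) {
    assert(batch_size > 0);
    if (position_ == indices_.size()) {
      reset();
      return {};
    }
    const std::size_t end = std::min(position_ + batch_size, indices_.size());
    std::vector<std::size_t> batch(indices_.begin() + position_,
                                   indices_.begin() + end);
    position_ = end;
    return batch;
  }

 private:
  std::vector<std::size_t> indices_;
  std::mt19937 rng_;
  std::size_t position_ = 0;
};
```

A training loop would call next(batch_size) until it gets an empty vector, use each returned batch of indices to fetch samples from the dataset, and then simply start calling next() again for the following epoch.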