Role of shuffle=True while Iterating Dataloader Object

shuffle (bool, optional) – set to True to have the data reshuffled at every epoch (default: False).

What does "epoch" mean here?

Suppose we do something like this:

test_loader = torch.utils.data.DataLoader(dataset, batch_size=10, shuffle=True, **kwargs)
for data, target in test_loader:
    process(data, target)

Do we get shuffled batches in every iteration?

Hello,

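According to the source, when shuffle=True the DataLoader creates a RandomSampler.
With the default replacement=False, RandomSampler returns

return iter(torch.randperm(n).tolist())

where n is len(self.data_source). The other branch,

return iter(torch.randint(high=n, size=(self.num_samples,), dtype=torch.int64).tolist())

is only used when replacement=True; there, self.num_samples also defaults to len(self.data_source). So in the case you mentioned, the batches are shuffled: the sample order is drawn once, when the for loop first asks the loader for an iterator, and stays fixed for that pass. An "epoch" here is one complete pass over the DataLoader, so running the for loop again draws a new order.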
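
To check this yourself, here is a minimal sketch. The toy dataset (20 samples whose values equal their indices) and the helper epoch_order are made up for illustration and are not from the original post. It runs two full passes over the same loader and records the order in which samples come out.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset (illustrative assumption): sample i simply holds the value i.
dataset = TensorDataset(torch.arange(20))
loader = DataLoader(dataset, batch_size=5, shuffle=True)

def epoch_order(loader):
    """Run one full pass (one epoch) and return the samples in the order seen."""
    order = []
    for (batch,) in loader:  # each batch is a 1-element sequence holding a tensor
        order.extend(batch.tolist())
    return order

first_epoch = epoch_order(loader)   # the order is drawn when this loop starts
second_epoch = epoch_order(loader)  # a fresh order is drawn for the second pass

print(first_epoch)
print(second_epoch)
```

Each pass should visit every sample exactly once, and the two printed orders should almost certainly differ. Within a single pass the order does not change once the loop has started.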
