Number of data points of DataLoader with SubsetRandomSampler

We have a DataLoader that uses a SubsetRandomSampler. Is it possible to get the number of data points the loader will actually sample from the DataLoader itself?

You can call len() on the loader's internal .sampler attribute, as shown here:

import torch

N = 100
dataset = torch.utils.data.TensorDataset(torch.randn(N, 1))
# Sample only from the first half of the dataset (indices 0..49)
sampler = torch.utils.data.sampler.SubsetRandomSampler(indices=torch.arange(N//2))
loader = torch.utils.data.DataLoader(dataset, sampler=sampler, batch_size=7)

print(len(loader))
# 8, the number of batches
# 7 full batches and 1 batch with a single element: 7*7+1*1 = 50

print(len(loader.sampler))
# 50, the number of samples
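To make the relation between the two lengths explicit, here is a minimal sketch, assuming the same setup as above. The names loader_drop, seen, and seen_drop are just illustrative. It checks that len(loader) is the sample count divided by the batch size (rounded up by default, rounded down with drop_last=True) and counts the samples that are actually yielded:

```python
import math

import torch
from torch.utils.data import DataLoader, SubsetRandomSampler, TensorDataset

N = 100
batch_size = 7
dataset = TensorDataset(torch.randn(N, 1))
sampler = SubsetRandomSampler(indices=list(range(N // 2)))

# Default behaviour: the last, smaller batch is kept.
loader = DataLoader(dataset, sampler=sampler, batch_size=batch_size)
# With drop_last=True the incomplete final batch is discarded.
loader_drop = DataLoader(dataset, sampler=sampler, batch_size=batch_size, drop_last=True)

num_samples = len(loader.sampler)
print(num_samples)                                         # samples per epoch
print(len(loader), math.ceil(num_samples / batch_size))    # batches, rounded up
print(len(loader_drop), num_samples // batch_size)         # batches, rounded down

# Count the samples that are actually yielded in one pass.
seen = sum(batch[0].shape[0] for batch in loader)
seen_drop = sum(batch[0].shape[0] for batch in loader_drop)
print(seen, seen_drop)
```

Note that with drop_last=True, len(loader.sampler) still reports the full sample count, so it can be larger than the number of samples you actually iterate over.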

Perfect, it seems this even works when no sampler is passed, in which case it returns the size of the whole dataset.
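A small sketch of why that works, assuming a map-style dataset like the TensorDataset above: when no sampler is given, the DataLoader creates a default one internally (a SequentialSampler, or a RandomSampler if shuffle=True), and its length is the dataset size. The names plain_loader and shuffled_loader are only for illustration:

```python
import torch
from torch.utils.data import DataLoader, RandomSampler, SequentialSampler, TensorDataset

dataset = TensorDataset(torch.randn(100, 1))

# No sampler passed: DataLoader builds a default one internally.
plain_loader = DataLoader(dataset, batch_size=7)
shuffled_loader = DataLoader(dataset, batch_size=7, shuffle=True)

print(type(plain_loader.sampler).__name__)     # default sampler without shuffling
print(type(shuffled_loader.sampler).__name__)  # default sampler with shuffle=True

# Both default samplers cover the whole dataset.
print(len(plain_loader.sampler), len(shuffled_loader.sampler), len(dataset))
```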

You’re the best, @ptrblck