About torch.utils.data.DataLoader

Hi, could someone tell me the difference between len(dataloader) and len(dataloader.dataset)? Thanks!

The length of the DataLoader gives you the number of batches, while the length of the Dataset is usually defined as the number of samples.
If you are writing a custom Dataset, you set the length yourself in the __len__ method, which is why I said “usually”. :wink:
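For a concrete illustration, here is a minimal sketch with a made-up custom Dataset and an arbitrary batch size:

```python
import torch
from torch.utils.data import Dataset, DataLoader

class MyDataset(Dataset):
    def __init__(self, num_samples=10):
        self.data = torch.arange(num_samples).float()

    def __len__(self):
        # This is the value len(loader.dataset) will report
        return len(self.data)

    def __getitem__(self, index):
        return self.data[index]

loader = DataLoader(MyDataset(), batch_size=3)
print(len(loader.dataset))  # 10 -> number of samples
print(len(loader))          # 4  -> number of batches, i.e. ceil(10 / 3)
```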


Thanks a lot! I confirmed your answer by printing len(…):
len(train_loader) is the number of batches, while len(train_loader.dataset) is the total number of samples in the training set, which is equivalent to number of batches * batch size.

num_batches * batch_size = num_samples would only be true if each batch is full.
Otherwise the last batch might be smaller. :wink:
PS: if you want to get rid of the last, potentially smaller batch, you can specify drop_last=True in your DataLoader.
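A small sketch showing the effect (the sample count and batch size are chosen arbitrarily):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.arange(10).float())  # 10 samples

# Default (drop_last=False): the smaller last batch is kept
# -> ceil(10 / 3) = 4 batches
loader = DataLoader(dataset, batch_size=3)
print(len(loader))  # 4

# drop_last=True: the incomplete last batch is dropped
# -> floor(10 / 3) = 3 batches
loader = DataLoader(dataset, batch_size=3, drop_last=True)
print(len(loader))  # 3
# Now len(loader) * 3 == 9, while len(loader.dataset) is still 10
```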