The length of the loader will adapt to the batch_size. So if your train dataset has 1000 samples and you use a batch_size of 10, the loader will have a length of 100.
Note that the last batch yielded by your loader can be smaller than the actual batch_size if the dataset size is not evenly divisible by the batch_size. E.g. for 1001 samples and a batch_size of 10, train_loader will have len(train_loader)=101 and the last batch will only contain 1 sample. You can avoid this by setting drop_last=True.
import torch
from torch.utils.data import Dataset, DataLoader


class MyDataset(Dataset):
    def __init__(self, size):
        self.x = torch.randn(size, 1)

    def __getitem__(self, index):
        return self.x[index]

    def __len__(self):
        return len(self.x)


dataset = MyDataset(1001)

# Default: drop_last=False, so the smaller final batch is kept
data_loader = DataLoader(dataset, batch_size=10)
print(len(data_loader))  # 101

for batch_idx, data in enumerate(data_loader):
    print('batch idx {}, batch len {}'.format(batch_idx, len(data)))
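With 1001 samples and a batch_size of 10, this loop should print 101 batches: batches 0 through 99 with 10 samples each, and batch 100 with the single leftover sample.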
# With drop_last=True the incomplete final batch is dropped
data_loader = DataLoader(dataset, batch_size=10, drop_last=True)
print(len(data_loader))  # 100

for batch_idx, data in enumerate(data_loader):
    print('batch idx {}, batch len {}'.format(batch_idx, len(data)))
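More generally, for a map-style dataset with the default sampler, len(loader) works out to ceil(len(dataset) / batch_size) without drop_last and floor(len(dataset) / batch_size) with it. A minimal sketch of that arithmetic for this example (the N and batch_size names here are just for illustration):

import math

N, batch_size = 1001, 10
print(math.ceil(N / batch_size))  # 101 batches when drop_last=False
print(N // batch_size)            # 100 batches when drop_last=True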