A problem about the length of DataLoader

Hi all, I encountered a curious problem. My PyTorch code looks like this:

for batch_idx, (imgs, pids, _) in enumerate(trainloader):
    print(batch_idx, len(trainloader))

The problem is that the batch_idx variable never reaches len(trainloader); in other words, the loop does not go through all of the data.

Could someone tell me why this occurs?

Note that I am testing a new dataset, and the dataset itself looks normal. I have also checked the dataset-related code, but I cannot find what causes this problem.

What is the difference between the last batch_idx and len(trainloader)?
Note that the last index would be (len - 1), as Python uses 0-based indexing.
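
For example, here is a minimal sketch with a toy dataset (the names are just placeholders) showing that the last value of batch_idx is len(loader) - 1:

import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy example: 100 samples, batch_size=10 -> len(loader) == 10,
# while enumerate yields batch_idx 0..9, so the last index is len(loader) - 1.
loader = DataLoader(TensorDataset(torch.arange(100)), batch_size=10)
for batch_idx, batch in enumerate(loader):
    pass
print(batch_idx, len(loader))  # prints: 9 10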

[Screenshot of the training log output]
As the figure shows, the second number is batch_idx and the third number is len(trainloader). The result is printed every 10 batches, and the code actually prints batch_idx + 1 rather than batch_idx. Even so, at least 25 batches (595 - 570) are never printed. It is quite confusing.

Could you post the line of code printing this output?
Also, what is your batch size and how many samples does your Dataset contain?

Yes, here is the code:

for batch_idx, (imgs, pids, _) in enumerate(trainloader):
    # ... some training code ...
    if (batch_idx + 1) % args.print_freq == 0:
        print('Epoch: [{0}][{1}/{2}]\t'
              'Time {batch_time.val:.3f} ({batch_time.avg:.3f})\t'
              'Data {data_time.val:.4f} ({data_time.avg:.4f})\t'
              'Loss {loss.val:.4f} ({loss.avg:.4f})\t'.format(
                  epoch + 1, batch_idx + 1, len(trainloader),
                  batch_time=batch_time, data_time=data_time, loss=losses))

and the trainloader is defined as

trainloader = DataLoader(
    ImageDataset(dataset.train, transform=transform_train),
    sampler=RandomIdentitySampler(dataset.train, args.train_batch, args.num_instances),
    batch_size=args.train_batch, num_workers=args.workers,
    pin_memory=pin_memory, drop_last=True,
)
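
If I understand correctly, with a custom sampler and drop_last=True the length of the DataLoader is derived from len(sampler) // batch_size rather than from the dataset itself. A rough diagnostic I could run (just a sketch, reusing the sampler class and args from the snippet above) would compare the sampler's reported length with the number of indices it actually yields:

# Rough diagnostic sketch (reuses RandomIdentitySampler, dataset.train, args and
# trainloader from above; not verified here). If len(sampler) disagrees with the
# number of indices the sampler actually yields, len(trainloader) would be larger
# than the number of batches the loop can actually produce.
sampler = RandomIdentitySampler(dataset.train, args.train_batch, args.num_instances)
yielded = sum(1 for _ in sampler)  # indices the sampler actually produces
print('len(sampler):', len(sampler))
print('indices actually yielded:', yielded)
print('batches expected with drop_last=True:', yielded // args.train_batch)
print('len(trainloader):', len(trainloader))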

The batch size is set to 128, and there are about 150k samples in the dataset.
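
For reference, a quick arithmetic check with those numbers (assuming the ~150k figure and drop_last=True, and ignoring the custom sampler):

# Batches expected if the length came from the dataset itself, with drop_last=True:
print(150_000 // 128)  # 1171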