Is there a way to train one model using multiple dataloaders?

Hi,

I am trying to create and use multiple data loaders, each with a different image size and batch size.

For example, suppose I create 5 data loaders and insert them into a list called dataloader_list.

Then training proceeds as follows.

def main():
    ...

    dataloader_list = list()
    for i in range(5):
        # each loader is built with its own dataset, image size and batch size
        trainloader = torch.utils.data.DataLoader(...)
        dataloader_list.append(trainloader)

    ...

    for epoch in range(args.epochs):
        for i in range(len(dataloader_list)):
            train(dataloader_list[i], model, ...)

def train(train_loader, model, criterion, criterion_exist, optimizer, epoch, args, rank=None):
    batch_time = AverageMeter()
    data_time = AverageMeter()
    losses = AverageMeter()
    losses_exist = AverageMeter()

    # switch to train mode
    model.train()

    end = time.time()
    for i, (input, target, target_exist) in enumerate(train_loader):
        # measure data loading time
        data_time.update(time.time() - end)

        target = target.cuda()
        target_exist = target_exist.float().cuda()
        input_var = torch.autograd.Variable(input)

        # input_var = input_var.permute(0, 3, 1, 2)

        target_var = torch.autograd.Variable(target)
        target_exist_var = torch.autograd.Variable(target_exist)

        # target_var = target_var.type(torch.LongTensor).cuda()

        # compute output
        output, output_exist = model(input_var)  # output_mid
        loss = criterion(torch.nn.functional.log_softmax(output, dim=1), target_var)
        # print(output_exist.data.cuda().numpy().shape)
        loss_exist = criterion_exist(output_exist, target_exist_var)
        loss_tot = loss + loss_exist * 0.1

        # measure accuracy and record loss
        losses.update(loss.data.item(), input.size(0))
        losses_exist.update(loss_exist.item(), input.size(0))

        # compute gradient and do SGD step
        optimizer.zero_grad()
        loss_tot.backward()
        optimizer.step()

        # measure elapsed time
        batch_time.update(time.time() - end)
        end = time.time()

        if (i + 1) % args.print_freq == 0:
            print('Rank: {0} input_shape: {1} Epoch: [{2}][{3}/{4}], lr: {lr:.5f}\t'
                  'Time {batch_time.val:.3f} ({batch_time.avg:.3f})\t'
                  'Data {data_time.val:.3f} ({data_time.avg:.3f})\t'
                  'Loss {loss.val:.4f} ({loss.avg:.4f})\t'
                  'Loss_exist {loss_exist.val:.4f} ({loss_exist.avg:.4f})\t'.format(
                      rank, input.shape[2:], epoch, i, len(train_loader),
                      batch_time=batch_time, data_time=data_time,
                      loss=losses, loss_exist=losses_exist,
                      lr=optimizer.param_groups[-1]['lr']))
            batch_time.reset()
            data_time.reset()
            losses.reset()

This works as expected.

What I want to do is average the loss values computed from the 5 data loaders and then use that average to update the model.

Is there a way to do this?

One more thing I am curious about: each data loader only starts up when train() begins iterating over it, which adds delay. Is there any way to minimize this time?

I think the simple approach would be to extract the batches from the 5 data loaders with a for loop in advance. Please comment on whether this is correct.
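Something like this is what I have in mind (just a rough sketch; I realize caching every batch in memory may not be practical):

    # rough sketch: pull all batches out of every loader before the training loop
    cached_batches = []
    for loader in dataloader_list:
        cached_batches.append(list(loader))  # list of (input, target, target_exist) tuples

    for epoch in range(args.epochs):
        for batches in cached_batches:
            for input, target, target_exist in batches:
                # same steps as in train()
                ...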

Thank you in advance.

I think that in your case a better idea might be to use a single data loader with multiple data sources. If you write it carefully, you can still control the batch size and image size for each source.
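For example (just a sketch, and only one way to do it; it assumes each source dataset already returns images at its own fixed size, and dataset_list / batch_size_list are placeholders for your own data), you can wrap the sources in a ConcatDataset and use a custom batch sampler so that every batch is drawn from a single source, which keeps the image size and batch size consistent within each batch:

    import torch
    from torch.utils.data import ConcatDataset, DataLoader, Sampler

    class PerSourceBatchSampler(Sampler):
        # yields batches whose indices all come from the same sub-dataset,
        # so each batch has one image size and its own batch size
        def __init__(self, concat_dataset, batch_sizes):
            self.boundaries = [0] + concat_dataset.cumulative_sizes
            self.batch_sizes = batch_sizes

        def __iter__(self):
            # batches are emitted source by source; you could also shuffle the batch order
            for k, bs in enumerate(self.batch_sizes):
                n = self.boundaries[k + 1] - self.boundaries[k]
                indices = (torch.randperm(n) + self.boundaries[k]).tolist()
                for i in range(0, len(indices), bs):
                    yield indices[i:i + bs]

        def __len__(self):
            return sum(
                (self.boundaries[k + 1] - self.boundaries[k] + bs - 1) // bs
                for k, bs in enumerate(self.batch_sizes)
            )

    dataset = ConcatDataset(dataset_list)
    loader = DataLoader(dataset, batch_sampler=PerSourceBatchSampler(dataset, batch_size_list))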

However, if you still want to use your approach, a better option may be a single loop over all the loaders at once:

for item1, item2 in zip(dataloader1, dataloader2):
    image_batch1, labels1 = item1
    image_batch2, labels2 = item2
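
If the goal is to average the losses from all loaders before updating, you can compute one loss per loader inside that single loop and back-propagate their mean. A rough sketch reusing the names from your train() (model, criterion, criterion_exist, optimizer); note that zip stops as soon as the shortest loader is exhausted:

    for batches in zip(*dataloader_list):
        optimizer.zero_grad()
        loss_tot = 0.0
        for input, target, target_exist in batches:
            target = target.cuda()
            target_exist = target_exist.float().cuda()
            output, output_exist = model(input)
            loss = criterion(torch.nn.functional.log_softmax(output, dim=1), target)
            loss_exist = criterion_exist(output_exist, target_exist)
            loss_tot = loss_tot + loss + loss_exist * 0.1
        # average over the loaders, then perform a single optimizer step
        (loss_tot / len(dataloader_list)).backward()
        optimizer.step()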

If it does not help you, please describe your problem in more detail.

Thank you for your answer.

But I still have no idea how to use it.

The model takes a single image and label as input. How would I train with two items at once?

You can use a prefetcher (apex/main_amp.py at e2083df5eb96643c61613b9df48dd4eea6b07690 · NVIDIA/apex · GitHub). Every time you need data from the k-th data loader:
input, target = prefetcher_k.next()
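
For reference, here is a simplified sketch of the data_prefetcher pattern used in that example (not the exact code, and the class name is only illustrative): it preloads the next batch and copies it to the GPU on a side CUDA stream so the copy overlaps with computation. You would keep one prefetcher per data loader and recreate it at the start of each epoch, since it returns None once its loader is exhausted:

    import torch

    class DataPrefetcher:
        # simplified version of the data_prefetcher pattern from the apex example
        def __init__(self, loader):
            self.loader = iter(loader)
            self.stream = torch.cuda.Stream()
            self.preload()

        def preload(self):
            try:
                self.next_input, self.next_target = next(self.loader)
            except StopIteration:
                self.next_input = None
                self.next_target = None
                return
            with torch.cuda.stream(self.stream):
                # copy the next batch to the GPU asynchronously
                self.next_input = self.next_input.cuda(non_blocking=True)
                self.next_target = self.next_target.cuda(non_blocking=True)

        def next(self):
            # wait until the async copy issued in preload() has finished
            torch.cuda.current_stream().wait_stream(self.stream)
            input, target = self.next_input, self.next_target
            self.preload()
            return input, target

    # one prefetcher per data loader, e.g.
    # prefetcher_k = DataPrefetcher(dataloader_list[k])
    # input, target = prefetcher_k.next()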