Built-in mini-batch vs. concatenation of each output

Hi folks,

I ended up not using mini-batches or the DataLoader in PyTorch because I work on multi-instance learning, where I need to pass a bag of images with a single label on limited hardware. In the normal case I would definitely use the built-in DataLoader, since it’s probably more stable and efficient. Currently, I pass the images in a bag one by one, concatenate the features from a backbone, and pass them to a separate binary classifier model.

The question arises when I try to process a “batch of outputs and labels” at the same time to stabilize learning. My model processes 4 bags of 32 images sequentially using for loops and combines the outputs into a 1x4 tensor. The 1x4 tensors of outputs and labels are then passed to the loss function and optimizer (I know, it is super inefficient) to calculate gradients for the separate backbone and binary classifier models.

Would this approach (concatenating sequential outputs) still mimic mini-batching?
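
For concreteness, the setup described above could be sketched roughly as follows. This is only a toy illustration: the tiny `backbone`, the `classifier`, the image size, the random data, and the choice to flatten the concatenated features into one vector per bag are all assumptions, not the actual models.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the real backbone and bag-level classifier.
backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 16), nn.ReLU())
classifier = nn.Linear(32 * 16, 1)  # one logit per bag of 32 images
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(
    list(backbone.parameters()) + list(classifier.parameters()), lr=0.01
)

# 4 bags of 32 toy images each, with one binary label per bag.
bags = [torch.randn(32, 3, 8, 8) for _ in range(4)]
labels = torch.tensor([[0.0, 1.0, 1.0, 0.0]])  # shape 1x4


def bag_logit(bag):
    # Pass the images one by one and concatenate their features (assumed layout).
    feats = torch.cat([backbone(img.unsqueeze(0)) for img in bag], dim=1)  # 1x512
    return classifier(feats)  # 1x1


# Process the bags sequentially and combine the outputs into a 1x4 tensor.
outputs = torch.cat([bag_logit(bag) for bag in bags], dim=1)

optimizer.zero_grad()
loss = criterion(outputs, labels)
loss.backward()   # gradients reach both the backbone and the classifier
optimizer.step()
```

Note that the autograd graphs of all 4 bags are kept alive until `backward()` is called here, which matters on limited hardware.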

Hi Hojun,

This would be more like accumulating gradients over 4 bags of 32 images, unless you meant a single batch of size 4*32. Here is an example of how you would do this.

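# model, criterion, optimizer, device, and bags are assumed to be defined elsewhere
for bag in bags:
    accum_iter = len(bag)
    optimizer.zero_grad()
    for inputs, labels in bag:
        # move inputs and labels to the target device
        inputs = inputs.to(device)
        labels = labels.to(device)
        # forward pass
        preds = model(inputs)
        loss = criterion(preds, labels)
        # normalize loss to account for batch accumulation
        loss = loss / accum_iter
        # backward pass: gradients accumulate in .grad across iterations
        loss.backward()
    # update weights once, after all gradients have been accumulated
    optimizer.step()
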

This example is a refactoring of this post to reflect your problem statement.
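
To see why the two approaches line up, here is a small sanity check, offered as a sketch under assumptions rather than a definitive test. It uses a hypothetical single linear layer on made-up bag-level features and compares (a) one loss call on the concatenated 1x4 output against (b) one backward pass per bag with the loss divided by the number of bags.

```python
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy model: maps a 16-dim bag feature to a single logit.
model_a = nn.Linear(16, 1)
model_b = copy.deepcopy(model_a)
criterion = nn.BCEWithLogitsLoss()  # default reduction is "mean"

bag_feats = torch.randn(4, 16)               # made-up features for 4 bags
labels = torch.tensor([0.0, 1.0, 1.0, 0.0])  # one label per bag

# (a) Concatenate the 4 sequential outputs, then call the loss once.
outputs = torch.cat([model_a(f.unsqueeze(0)) for f in bag_feats], dim=1)  # 1x4
criterion(outputs, labels.unsqueeze(0)).backward()

# (b) Gradient accumulation: one backward per bag, loss scaled by 1/4.
for f, y in zip(bag_feats, labels):
    loss = criterion(model_b(f.unsqueeze(0)), y.view(1, 1))
    (loss / len(bag_feats)).backward()

# Compare the accumulated gradients of (b) with the gradients of (a).
grads_match = all(
    torch.allclose(pa.grad, pb.grad, atol=1e-6)
    for pa, pb in zip(model_a.parameters(), model_b.parameters())
)
print(grads_match)
```

With a mean-reduced loss the gradients should agree up to floating-point error, so the concatenated version does behave like a mini-batch of 4 bags. The accumulation version just frees each bag's graph earlier. Layers that depend on batch statistics, such as batch norm, are a separate matter, since they see one sample at a time either way.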

Thanks for the comment and the reference!
