Manually generate training batch

I have a dataset that includes thousands of images at a resolution of 2048 by 2048. I want to use a small patch (like 256 by 256) as a single example. One image can generate several (like 20 or 50) examples, and these are composed into a training batch as the input. If I use the Dataset class and the DataLoader, the DataLoader will read a single image and crop a patch, then read another image and crop another patch, and so on. All these patches from different images are then composed into one batch. That is not what I want to do. I want one batch to come from one image.

Is there some way to manually generate a training batch?


You could crop the image patches manually in the __getitem__ method and then stack the crops into the batch dimension.
I've written a small example using torchvision.transforms.FiveCrop.
Since you need more crops, you would have to adapt the code to your own crop function.

import torch
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms

class MyDataset(Dataset):
    def __init__(self, images):
        self.images = images
        self.crop = transforms.FiveCrop(size=10)
        
    def __getitem__(self, index):
        image = self.images[index]
        # image: a 100x100 RGB PIL image
        # Crop it into five smaller patches
        crops = self.crop(image)
        crops = torch.stack([transforms.ToTensor()(crop) for crop in crops])
        # Crops dim: [5, 3, 10, 10]
        
        return crops
        
    def __len__(self):
        return len(self.images)

# Create random images
images = [transforms.ToPILImage()(x) for x in torch.randn(10, 3, 100, 100)]
dataset = MyDataset(images)

loader = DataLoader(dataset, batch_size=2)

for batch_idx, data in enumerate(loader):
    bs, ncrops, c, h, w = data.size()
    # Reshape ncrops into batch size
    data = data.view(-1, c, h, w)
    # Continue with training procedure

If you want the whole batch to come from one single image, you would have to set batch_size=1 in the DataLoader and return all of your crops from __getitem__.
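Here is a minimal sketch of that approach, assuming 2048x2048 PIL input images as in your case; the class name OneImageBatchDataset, the use of transforms.RandomCrop, and crops_per_image=20 are my own choices, so adapt them to your cropping logic:

import torch
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms

class OneImageBatchDataset(Dataset):
    def __init__(self, images, crops_per_image=20):
        self.images = images
        self.crops_per_image = crops_per_image  # number of patches per batch (assumption)
        self.crop = transforms.RandomCrop(size=256)

    def __getitem__(self, index):
        image = self.images[index]
        # Take several random 256x256 crops from the same image
        crops = [transforms.ToTensor()(self.crop(image))
                 for _ in range(self.crops_per_image)]
        return torch.stack(crops)  # [crops_per_image, 3, 256, 256]

    def __len__(self):
        return len(self.images)

# images: your list of PIL images, as before
loader = DataLoader(OneImageBatchDataset(images), batch_size=1, shuffle=True)

for data in loader:
    # data: [1, crops_per_image, 3, 256, 256]
    data = data.squeeze(0)  # drop the batch_size=1 dim -> [crops_per_image, 3, 256, 256]
    # Continue with training procedure

Since batch_size=1, each iteration yields all crops from exactly one image, and squeezing the leading dimension turns them into the effective batch.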


Thank you. Your answer solved my problem.