I have a dataset of thousands of images, each with a resolution of 2048 by 2048. I want to use a small patch (e.g. 256 by 256) as a single example. One image can yield several examples (say 20 or 50), and these should be combined into one training batch. If I use the Dataset class and a DataLoader, the DataLoader reads one image and crops a patch, then reads another image and crops another patch, and so on, so the patches in a batch all come from different images. That is not what I want. I want each batch to come from a single image.
Is there a way to generate training batches manually?
You could crop the image patches manually in the __getitem__ function, stack the crops into one tensor, and then fold the crop dimension into the batch dimension in your training loop.
I've written a small example using torchvision.transforms.FiveCrop.
Since you need more crops, you should adapt the code to your own crop function.
import torch
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms

class MyDataset(Dataset):
    def __init__(self, images):
        self.images = images
        self.crop = transforms.FiveCrop(size=10)
        self.to_tensor = transforms.ToTensor()

    def __getitem__(self, index):
        image = self.images[index]
        # image dim: [3, 100, 100]
        # Crop to smaller image patches (four corners plus center)
        crops = self.crop(image)
        crops = torch.stack([self.to_tensor(crop) for crop in crops])
        # crops dim: [5, 3, 10, 10]
        return crops

    def __len__(self):
        return len(self.images)

# Create random images (torch.rand keeps values in [0, 1] for ToPILImage)
images = [transforms.ToPILImage()(x) for x in torch.rand(10, 3, 100, 100)]
dataset = MyDataset(images)
loader = DataLoader(dataset, batch_size=2)

for batch_idx, data in enumerate(loader):
    # data dim: [2, 5, 3, 10, 10]
    bs, ncrops, c, h, w = data.size()
    # Fold ncrops into the batch dimension: [10, 3, 10, 10]
    data = data.view(-1, c, h, w)
    # Continue with training procedure
If you want the whole batch to come from one single image, you would have to set the batch_size to 1 in the DataLoader and return your multiple crops.
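For the original use case (many random patches per image, one image per batch), the same idea could be sketched roughly as below. This is only a minimal sketch: the class name RandomPatchDataset and the parameters patch_size and patches_per_image are made up for illustration, the random cropping is done with plain tensor slicing instead of a torchvision transform, and small 512 by 512 random tensors stand in for the real 2048 by 2048 images.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class RandomPatchDataset(Dataset):
    """Hypothetical dataset: each item is a stack of random patches from ONE image."""

    def __init__(self, images, patch_size=256, patches_per_image=20):
        self.images = images  # sequence of [C, H, W] tensors (assumption)
        self.patch_size = patch_size
        self.patches_per_image = patches_per_image

    def __getitem__(self, index):
        image = self.images[index]
        _, height, width = image.shape
        p = self.patch_size
        patches = []
        for _ in range(self.patches_per_image):
            # Pick a random top-left corner so the patch stays inside the image
            top = torch.randint(0, height - p + 1, (1,)).item()
            left = torch.randint(0, width - p + 1, (1,)).item()
            patches.append(image[:, top:top + p, left:left + p])
        # Result dim: [patches_per_image, C, p, p]
        return torch.stack(patches)

    def __len__(self):
        return len(self.images)


# Small random stand-ins for the real images
images = [torch.rand(3, 512, 512) for _ in range(4)]
dataset = RandomPatchDataset(images, patch_size=256, patches_per_image=20)

# batch_size=1 means every batch is built from exactly one image
loader = DataLoader(dataset, batch_size=1, shuffle=True)

batch_shapes = []
for data in loader:
    # Drop the leading batch dimension: [1, 20, 3, 256, 256] -> [20, 3, 256, 256]
    data = data.squeeze(0)
    batch_shapes.append(tuple(data.shape))
    # Continue with training procedure
```

With this layout the effective batch size is patches_per_image, so that value, not the DataLoader's batch_size, is what you would tune.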
Thank you. Your answer solved my problem.