Cropping images into 128×128 patches

Hello,
I’m working with the Challenge on Learned Image Compression (CLIC) dataset, which contains images of different sizes. I am using the code below to crop the images, but once I use a DataLoader and try to start training, this error occurs:
RuntimeError: each element in list of batch should be of equal size.

But I made sure that every block is 128×128×3.

import os
import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset

class CustomDataset(Dataset):
    def __init__(self, root_dir, transform=None):
        self.root_dir = root_dir
        self.image_filenames = os.listdir(root_dir)
        self.transform = transform

    def __len__(self):
        return len(self.image_filenames)

    def __getitem__(self, idx):
        image_path = os.path.join(self.root_dir, self.image_filenames[idx])
        image = Image.open(image_path).convert('RGB')
        image = np.array(image)

        h, w = image.shape[0], image.shape[1]

        # Pad (with edge replication) so height and width become multiples of 128
        pad_h = (128 - h % 128) % 128
        pad_w = (128 - w % 128) % 128
        pad = ((pad_h, 0), (pad_w, 0), (0, 0))
        image = np.pad(image, pad, mode="edge")

        # Crop the padded image into 128x128 blocks
        height, width, _ = image.shape
        cropped_images = []
        for i in range(0, height, 128):
            for j in range(0, width, 128):
                cropped_image = image[i:i+128, j:j+128, :]
                if cropped_image.shape == (128, 128, 3):
                    cropped_images.append(cropped_image)

        # Convert the crops to CHW float tensors in [0, 1]
        cropped_images = [torch.from_numpy(c.transpose((2, 0, 1))).float() / 255.0 for c in cropped_images]

        if self.transform:
            cropped_images = [self.transform(c) for c in cropped_images]
        return cropped_images

import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import transforms
from torchvision.transforms import ToTensor

# Load the dataset

dataset = CustomDataset('C:/Users/mhund/Downloads/AE/data_1/train', transform=transforms.Compose([
    transforms.ToPILImage(),
    transforms.ToTensor()
]))

dataloader = DataLoader(dataset, batch_size=32, shuffle=True)

# Define the autoencoder model and optimizer

autoencoder = Autoencoder().cuda()
optimizer = optim.Adam(autoencoder.parameters(), lr=1e-4)

# Train the autoencoder

for epoch in range(0, 100):
    for i, batch in enumerate(dataloader):  # the error occurs here
        print(i)

Any suggestions? Thank you

I think the issue is that you are effectively returning a “batch” of images (each crop can be considered a separate image) from your __getitem__ function. When the dataloader assembles a batch from your dataset, it will call __getitem__ 32 times (your batch size), so the batch it tries to collate will really look something like

[[image0crop1, image0crop2, ..., image0cropn], [image1crop1, image1crop2, ..., image1cropm], ..., [imagekcrop1, imagekcrop2, ..., imagekcropl]]

where n, m, and l can all be different. If the images in your dataset have different sizes, each can produce a different number of crops, which leads to the error. You could work around this by resizing each image to a fixed resolution before cropping, or by rethinking how you want to batch your crops.
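
If you want to keep returning all crops from __getitem__, one way to rethink the batching is a custom collate_fn that flattens the per-image lists of crops into a single tensor. This is only a rough sketch of that idea (the collate_flatten_crops name is mine, not part of any library), and note that the effective batch size then varies with how many crops each image produces:

import torch
from torch.utils.data import DataLoader

def collate_flatten_crops(batch):
    # batch is a list of lists: one list of 3x128x128 crop tensors per image
    crops = [crop for crops_per_image in batch for crop in crops_per_image]
    # Stack everything into a single (N, 3, 128, 128) tensor; N changes
    # from batch to batch depending on the original image sizes
    return torch.stack(crops, dim=0)

dataloader = DataLoader(dataset, batch_size=4, shuffle=True,
                        collate_fn=collate_flatten_crops)

With full-resolution images this can produce a lot of crops per batch, so you may want a small batch_size here, or sample a fixed number of crops per image in __getitem__ instead. The other option, resizing every image to the same resolution (e.g. with transforms.Resize) before cropping, guarantees the same number of crops per image, so the default collate works as well.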

Thank you, I took your advice, and it solved every problem I had.