How to create image crops in dataloader while training instead of saving offline beforehand?

I have several (2160,2560) images that I am splitting into (64,64) crops to run through my network. This means each image will generate 1350 crops.
What would be the most efficient way to obtain these crops to run through my network?
Right now, I’m thinking I could generate and save all the crops before training. I would create several folders that each contain the 1350 crops from one image, with the folder named after the original image. Then, I could just load them all in or use the ImageFolder dataset. However, this takes up a lot of time before I can start training.
How can I load the image into memory and create crops on the go without saving them beforehand?
Thank you!


If you’re looking for random crops of the image, you can use the random crop transform (`torchvision.transforms.RandomCrop`).

I would actually like to take every possible non-overlapping crop of the image.

What sample size are you working with?

Do you mean how many images? I have several hundred images, each of which will need to be split into 1350 crops of size (64, 64). There will probably be roughly 300,000 tiles in total.

I would suggest the following approach, where you save the cropped images first.

e.g. for classification into two classes:

  1. Convert all images to numpy array → save [cropped_image_pixelarray, class]
  2. Create a dataset class for loading from saved array

You can use similar code for getting cropped images:

Loop over all images, open them as cv2 images, and use this function (I have not verified that the function works, but something similar should suffice):

import numpy as np

def process_data(image, label, df):
    '''
    image: cv2 image as a (H, W, C) numpy array
    label: numpy array label, e.g. [0, 1] or [1, 0]
    df: list collecting [cropped_image_pixelarray, label] pairs
    '''
    image_h = 2160  # image height (rows)
    image_w = 2560  # image width (columns)
    crop = 64
    for i in range(image_h // crop):      # loop over rows
        for j in range(image_w // crop):  # loop over columns
            new_image = image[i*crop : (i+1)*crop, j*crop : (j+1)*crop]
            df.append([np.array(new_image), label])

# after looping over all images (dtype=object because entries are ragged):
np.save('data.npy', np.array(df, dtype=object))
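A quick sanity check on the loop bounds above, using only arithmetic on this thread's image and crop sizes:

```python
def count_crops(image_h, image_w, crop):
    # Non-overlapping grid: floor division along each axis.
    return (image_h // crop) * (image_w // crop)

# 2160 is not exactly divisible by 64 (2160 / 64 = 33.75), so flooring
# keeps 33 rows of tiles and drops a 48-pixel strip at the bottom edge.
print(count_crops(2160, 2560, 64))  # 1320 full crops per image
```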

The Dataset class would look like:

import numpy as np
import torch
from torch.utils.data import Dataset

class Data(Dataset):
    def __init__(self, data_path, transform=None):
        self.data = np.load(data_path, allow_pickle=True)
        self.transform = transform

    def __len__(self):
        return len(self.data)

    def __getitem__(self, x):
        # we saved each entry as [image_data, label]
        label = torch.from_numpy(self.data[x][1])
        image = torch.from_numpy(self.data[x][0])
        if self.transform is not None:
            image = self.transform(image)
        return image, label
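A usage sketch for feeding that class to a DataLoader. The class is repeated here (with the `self.data` lookup) so the snippet runs on its own, and the two saved crops are synthetic stand-ins for the real `data.npy`:

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset

class Data(Dataset):  # same class as above, repeated so this runs alone
    def __init__(self, data_path, transform=None):
        self.data = np.load(data_path, allow_pickle=True)
        self.transform = transform

    def __len__(self):
        return len(self.data)

    def __getitem__(self, x):
        label = torch.from_numpy(self.data[x][1])
        image = torch.from_numpy(self.data[x][0])
        if self.transform is not None:
            image = self.transform(image)
        return image, label

# Two fake (64, 64, 3) crops with one-hot labels, saved like 'data.npy'
df = [[np.zeros((64, 64, 3), dtype=np.uint8), np.array([1, 0])],
      [np.ones((64, 64, 3), dtype=np.uint8), np.array([0, 1])]]
np.save('demo_data.npy', np.array(df, dtype=object), allow_pickle=True)

loader = DataLoader(Data('demo_data.npy'), batch_size=2, shuffle=False)
images, labels = next(iter(loader))
print(images.shape, labels.shape)  # torch.Size([2, 64, 64, 3]) torch.Size([2, 2])
```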

Hope this helps
thanks

Thank you for your suggestion. This involves saving data offline before training begins, right? Is there a way to begin training before the crops are done saving? Or alternatively, is there a way to just load the (2160, 2560) image, store its corresponding 1350 crops in memory, and run it that way without saving them beforehand? This is to save time.

You mean cropping the images in the dataloader rather than saving the cropped images before training? I wrote a function to implement this idea; you can check whether it fits your needs:

def read_images(step, crop_size):
    assert step in ['train', 'valid', 'test']
    features, labels = [], []
    if step == 'train':
        img_path = train_images
        mask_path = train_labels
        for file in os.listdir(img_path):
            feature = Image.open(os.path.join(img_path, file))  # (H, W, C), e.g. 6000x6000x3
            label = Image.open(os.path.join(mask_path, file))
            width, height = feature.size
            stride = 256
            for i in range(0, width, stride):
                for j in range(0, height, stride):
                    box = (i, j, i + crop_size, j + crop_size)
                    cropped_feature = feature.crop(box)
                    cropped_label = label.crop(box)
                    features.append(to_tensor(cropped_feature))
                    labels.append(to_tensor(cropped_label))
    elif step == 'valid':
        img_path = valid_images
        mask_path = valid_labels
        for file in os.listdir(img_path):
            feature = Image.open(os.path.join(img_path, file))
            label = Image.open(os.path.join(mask_path, file))
            width, height = feature.size
            stride = 250
            for i in range(0, width, stride):
                for j in range(0, height, stride):
                    box = (i, j, i + crop_size, j + crop_size)
                    cropped_feature = feature.crop(box)
                    cropped_label = label.crop(box)
                    features.append(to_tensor(cropped_feature))
                    labels.append(to_tensor(cropped_label))
    else:
        img_path = test_images
        for file in os.listdir(img_path):
            feature = Image.open(os.path.join(img_path, file))
            width, height = feature.size
            stride = 250
            for i in range(0, width, stride):
                for j in range(0, height, stride):
                    box = (i, j, i + crop_size, j + crop_size)
                    cropped_feature = feature.crop(box)
                    features.append(to_tensor(cropped_feature))
    return features, labels

I used this method to train my model, but I found that it occupies a lot of memory, so it is not efficient. I have noticed that it has been more than two years since you asked this question. Please let me know if you have solved it in a proper way; I am wondering how to do it efficiently, thanks! :blush:
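One memory-friendly pattern (a sketch under my own assumptions, not taken from this thread) is to keep only the full-size images in memory and compute each crop's coordinates from the flat index inside `__getitem__`, so no crop is ever materialised up front and the DataLoader slices tiles on demand:

```python
import numpy as np
import torch
from torch.utils.data import Dataset

class GridCropDataset(Dataset):
    """Yields non-overlapping crop_size x crop_size tiles of each image."""

    def __init__(self, images, crop_size=64):
        # images: list of (H, W, C) uint8 arrays, all the same size
        self.images = images
        self.crop = crop_size
        h, w = images[0].shape[:2]
        self.rows = h // crop_size
        self.cols = w // crop_size
        self.per_image = self.rows * self.cols

    def __len__(self):
        return len(self.images) * self.per_image

    def __getitem__(self, idx):
        # Map the flat index to (image, row, col) on the crop grid.
        img_idx, tile_idx = divmod(idx, self.per_image)
        r, c = divmod(tile_idx, self.cols)
        y, x = r * self.crop, c * self.crop
        tile = self.images[img_idx][y:y + self.crop, x:x + self.crop]
        # (H, W, C) -> (C, H, W) float tensor in [0, 1]
        return torch.from_numpy(tile).permute(2, 0, 1).float() / 255.0

# Demo with one synthetic image of this thread's size
img = np.zeros((2160, 2560, 3), dtype=np.uint8)
ds = GridCropDataset([img])
print(len(ds))      # 1320 full tiles (33 rows * 40 columns)
print(ds[0].shape)  # torch.Size([3, 64, 64])
```

Because the crops are derived from the index, shuffling in the DataLoader still visits every tile exactly once per epoch, and memory stays at roughly one full-size image per worker.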