Cropping batches at the same position

mhusseinsh · September 5, 2018, 11:40am

As far as I understand, RandomCrop takes a random crop of the given size for each batch. Is there a way to fix the location the crop is taken from?

InnovArul · September 5, 2018, 12:08pm

We have torchvision.transforms.CenterCrop. Does that help you?

mhusseinsh · September 5, 2018, 12:10pm

It always gives a crop around the center, right? So if all images/tensors are the same size, the crop location will be the same.

mhusseinsh · September 5, 2018, 12:12pm

@InnovArul I will try to clarify exactly what I want.

It is OK to have RandomCrop in my case, but I want the random position to change only every second batch.

So for batch 1 the crop is taken from position (x, y), and for batch 2 from the same position (x, y), but batches 3 and 4 will use a different random position, and so on.

I know this is something quite specific to my case, but is there any way to do this?

ptrblck · September 5, 2018, 12:54pm

You could use a counter to choose whether to resample the random crop parameters or reuse them.
Here is a small (untested) example:

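import random

import torchvision.transforms.functional as TF
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

class MyDataset(Dataset):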
    def __init__(self, image_paths):
        self.image_paths = image_paths
        self.crop_indices = []

    def transform(self, image, resample):
        # Resize
        resize = transforms.Resize(size=(520, 520))
        image = resize(image)

        # Random crop
        # Also sample when nothing is stored yet (e.g. the first index is odd)
        if resample or not self.crop_indices:
            self.crop_indices = transforms.RandomCrop.get_params(
                image, output_size=(512, 512))
        i, j, h, w = self.crop_indices
        image = TF.crop(image, i, j, h, w)

        # Random horizontal flipping
        if random.random() > 0.5:
            image = TF.hflip(image)

        # Random vertical flipping
        if random.random() > 0.5:
            image = TF.vflip(image)

        # Transform to tensor
        image = TF.to_tensor(image)
        return image

    def __getitem__(self, index):
        image = Image.open(self.image_paths[index])
        # Resample on even indices, reuse the stored crop on odd ones
        resample = index % 2 == 0
        x = self.transform(image, resample)
        return x

    def __len__(self):
        return len(self.image_paths)

Let me know, if that works for you.
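
The example above decides per sample (`index % 2`), so it pairs consecutive samples rather than consecutive batches. A minimal sketch of a batch-level variant follows, assuming the samples are drawn in order (no shuffling) and the batch size is known inside the dataset. The helper name `crop_params_for_index` and its parameters are hypothetical and not part of torchvision; the offsets are derived from the batch group instead of a mutable counter.

```python
import random


def crop_params_for_index(index, batch_size, image_size, crop_size,
                          batches_per_crop=2, seed=0):
    """Return (i, j, h, w) shared by all samples in the same group of batches.

    Hypothetical helper: assumes samples are drawn in order (no shuffling),
    so that sample `index` belongs to batch `index // batch_size`.
    """
    img_h, img_w = image_size
    crop_h, crop_w = crop_size
    if crop_h > img_h or crop_w > img_w:
        raise ValueError("crop size must not exceed the image size")

    # Batches 0 and 1 form group 0, batches 2 and 3 form group 1, and so on.
    group = (index // batch_size) // batches_per_crop

    # Seeding by group makes the offsets a pure function of the group id,
    # so no counter has to be shared between calls.
    rng = random.Random(seed * 1_000_003 + group)
    i = rng.randint(0, img_h - crop_h)  # top offset
    j = rng.randint(0, img_w - crop_w)  # left offset
    return i, j, crop_h, crop_w


if __name__ == "__main__":
    # With batch_size=4, indices 0..7 share one crop and 8..15 share another.
    for index in (0, 3, 4, 7, 8, 11):
        print(index, crop_params_for_index(index, batch_size=4,
                                           image_size=(520, 520),
                                           crop_size=(512, 512)))
```

Inside `__getitem__`, the returned tuple could be passed to `TF.crop(image, i, j, h, w)` as in the example above. Because the result depends only on the index, it would not matter which process loads the sample.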

mhusseinsh · September 21, 2018, 5:30am

Hello @ptrblck,
sorry for my late reply.

Can RandomCrop work on tensors rather than images?
I will try to explain why.

I have something like this:

def get_data_loader_folder(input_folder, batch_size, train, new_size=None,
                           height=256, width=256, num_workers=4, crop=True):
    transform_list = [transforms.ToTensor(),
                      transforms.Normalize((0.5, 0.5, 0.5),
                                           (0.5, 0.5, 0.5))]
    transform_list = [transforms.RandomCrop((height, width))] + transform_list if crop else transform_list
    transform_list = [transforms.Resize((256, 256))] + transform_list if new_size is not None else transform_list
    transform_list = [transforms.RandomHorizontalFlip()] + transform_list if train else transform_list
    transform = transforms.Compose(transform_list)
    dataset = ImageFolder(input_folder, transform=transform)
    loader = DataLoader(dataset=dataset, batch_size=batch_size, shuffle=train, drop_last=True, num_workers=num_workers)
    return loader

class ImageFolder(data.Dataset):

    def __init__(self, root, transform=None, return_paths=False,
                 loader=default_loader):
        #imgs = sorted(make_dataset(root))
        # shuffle implicit pairs
        if "test" in root: 
            imgs = sorted(make_dataset(root))
        else:
            imgs = shuffle_pairs(sorted(make_dataset(root)))
        if len(imgs) == 0:
            raise(RuntimeError("Found 0 images in: " + root + "\n"
                               "Supported image extensions are: " +
                               ",".join(IMG_EXTENSIONS)))

        self.root = root
        self.imgs = imgs
        self.transform = transform
        self.return_paths = return_paths
        self.loader = loader

    def __getitem__(self, index):
        path = self.imgs[index]
        img = self.loader(path)
        if self.transform is not None:
            img = self.transform(img)
        if self.return_paths:
            return img, path
        else:
            return img

    def __len__(self):
        return len(self.imgs)

So when I call get_data_loader_folder(), I set crop=False, because I don’t want RandomCrop to pick a new position for every batch. As I said before, I want to crop at the same location for every two consecutive batches.

In the class ImageFolder, here is what happens: it applies all the transformations composed from transform_list and then returns a tensor with the transformations applied. Back to the RandomCrop:

Is there a way to apply it here?

    def __getitem__(self, index):
        path = self.imgs[index]
        img = self.loader(path)
        if self.transform is not None:
            img = self.transform(img)
            ##RANDOM CROP##
        if self.return_paths:
            return img, path

Keep in mind that the img returned from self.transform is a tensor at this point. And if that is not possible, do you have any tips or hints on how to achieve what I want in this case?

Thanks a lot, your help is much appreciated.

mhusseinsh · September 22, 2018, 5:58pm

hey @ptrblck, sorry to disturb
any help here ?

ptrblck · September 22, 2018, 6:24pm

Hi Mostafa,
sorry for the late reply. I’ve seen the thread but have completely forgotten to answer.

RandomCrop works on images and your current code should also work as you put all image transformations before ToTensor. So if no image transformations are given, your data will just be transformed to a tensor and normalized. Otherwise it will be cropped randomly etc. before the ToTensor transform.
Your code makes sense and I would like to stick to it.
Therefore we would need to create our own RandomCrop class.
We can derive from transforms.RandomCrop and just add the “counter” from my previous example so that the crop indices will be resampled in every second iteration:

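import torchvision.transforms.functional as F
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

class MyRandomCrop(transforms.RandomCrop):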
    def __init__(self, size, padding=0, pad_if_needed=False):
        super(MyRandomCrop, self).__init__(size, padding, pad_if_needed)
        self.counter = 0
        self.crop_indices = []
        
    def __call__(self, img):
        if self.padding is not None:
            img = F.pad(img, self.padding, self.fill, self.padding_mode)

        # pad the width if needed
        if self.pad_if_needed and img.size[0] < self.size[1]:
            img = F.pad(img, (self.size[1] - img.size[0], 0), self.fill, self.padding_mode)
        # pad the height if needed
        if self.pad_if_needed and img.size[1] < self.size[0]:
            img = F.pad(img, (0, self.size[0] - img.size[1]), self.fill, self.padding_mode)
        
        resample = self.counter % 2 == 0
        self.counter += 1
        if resample:
            self.crop_indices = self.get_params(img, self.size)
        i, j, h, w = self.crop_indices

        return F.crop(img, i, j, h, w)


transform = transforms.Compose([
    MyRandomCrop((10, 10)),
    transforms.ToTensor()
])
        
class MyDataset(Dataset):
    def __init__(self, image_paths, transform):
        self.image_paths = image_paths
        self.transform = transform
    
    def __getitem__(self, index):
        image = Image.open(self.image_paths[index])
        x = self.transform(image)
        return x
    
    def __len__(self):
        return len(self.image_paths)

Depending on the torchvision version you are using, you might need to adapt the arguments to RandomCrop.
Let me know, if that works for you!
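
The counter idea can also be looked at on its own, separate from the cropping code. Below is a minimal sketch in plain Python: `ReusedParams` is a hypothetical helper (not part of torchvision) that calls a sampling function once every `reuse` calls and repeats the cached result in between.

```python
import random


class ReusedParams:
    """Call `sample_fn` once every `reuse` calls and repeat its result in between.

    Hypothetical helper written for illustration; not part of torchvision.
    """

    def __init__(self, sample_fn, reuse=2):
        if reuse < 1:
            raise ValueError("reuse must be at least 1")
        self.sample_fn = sample_fn
        self.reuse = reuse
        self.counter = 0
        self.params = None

    def __call__(self, *args):
        # Resample on calls 0, reuse, 2 * reuse, ...; otherwise keep the cache.
        if self.counter % self.reuse == 0:
            self.params = self.sample_fn(*args)
        self.counter += 1
        return self.params


if __name__ == "__main__":
    # 600 - 256 = 344 is the largest valid offset for a 256 crop of a 600 image.
    sampler = ReusedParams(
        lambda: (random.randint(0, 344), random.randint(0, 344)), reuse=2)
    for _ in range(6):
        print(sampler())
```

Note that the counter counts calls, i.e. samples, so with a batch size larger than one the pairs are pairs of samples, not pairs of batches. Also, when a DataLoader runs with num_workers > 0, each worker process holds its own copy of the dataset and its transforms, so each copy keeps its own counter.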

mhusseinsh · September 22, 2018, 6:27pm

No problem @ptrblck.
So you are saying that I create this class myself, and then use it here like this?

def get_data_loader_folder(input_folder, batch_size, train, new_size=None,
                           height=256, width=256, num_workers=4, crop=True):
    transform_list = [transforms.ToTensor(),
                      transforms.Normalize((0.5, 0.5, 0.5),
                                           (0.5, 0.5, 0.5))]
    transform_list = [MyRandomCrop((height, width))] + transform_list if crop else transform_list
    transform_list = [transforms.Resize((256, 256))] + transform_list if new_size is not None else transform_list
    transform_list = [transforms.RandomHorizontalFlip()] + transform_list if train else transform_list
    transform = transforms.Compose(transform_list)
    dataset = ImageFolder(input_folder, transform=transform)
    loader = DataLoader(dataset=dataset, batch_size=batch_size, shuffle=train, drop_last=True, num_workers=num_workers)
    return loader

correct me if I am wrong

ptrblck · September 22, 2018, 6:28pm

Exactly! Let me know if you get any errors.

mhusseinsh · September 22, 2018, 6:29pm

But in the arguments, do I need to pass width and height as I am doing?

ptrblck · September 22, 2018, 6:38pm

Yes, you are using it right.
I was using the current master branch of torchvision and tried to post code for 0.2.1, forgetting about some fixes.
Here is the fixed code:

import torchvision.transforms.functional as TF
from torchvision import transforms

class MyRandomCrop(transforms.RandomCrop):
    def __init__(self, size, padding=0, pad_if_needed=False):
        super(MyRandomCrop, self).__init__(size, padding, pad_if_needed)
        self.counter = 0
        self.crop_indices = []
        
    def __call__(self, img):
        if self.padding > 0:
            img = TF.pad(img, self.padding)

        # pad the width if needed
        if self.pad_if_needed and img.size[0] < self.size[1]:
            img = TF.pad(img, (int((1 + self.size[1] - img.size[0]) / 2), 0))
        # pad the height if needed
        if self.pad_if_needed and img.size[1] < self.size[0]:
            img = TF.pad(img, (0, int((1 + self.size[0] - img.size[1]) / 2)))
        
        resample = self.counter % 2 == 0
        self.counter += 1
        if resample:
            self.crop_indices = self.get_params(img, self.size)
        i, j, h, w = self.crop_indices
        print('Using {} {} {} {}'.format(i, j, h, w))

        return TF.crop(img, i, j, h, w)

Let me know, if that works now!

mhusseinsh · September 22, 2018, 6:54pm

@ptrblck

I tried it, and it always gives the same (i, j):

Using 0 0 256 256
Using 0 0 256 256
Using 0 0 256 256
[... the same line, 44 times in total]

mhusseinsh · September 22, 2018, 6:55pm

This is how I have done it:

def get_data_loader_folder(input_folder, batch_size, train, new_size=None,
                           height=256, width=256, num_workers=4, crop=True):
    transform_list = [transforms.ToTensor(),
                      transforms.Normalize((0.5, 0.5, 0.5),
                                           (0.5, 0.5, 0.5))]
    #transform_list = [transforms.RandomCrop((height, width))] + transform_list if crop else transform_list
    transform_list = [MyRandomCrop((height, width))] + transform_list
    transform_list = [transforms.Resize((256, 256))] + transform_list if new_size is not None else transform_list
    transform_list = [transforms.RandomHorizontalFlip()] + transform_list if train else transform_list
    transform = transforms.Compose(transform_list)
    dataset = ImageFolder(input_folder, transform=transform)
    loader = DataLoader(dataset=dataset, batch_size=batch_size, shuffle=train, drop_last=True, num_workers=num_workers)
    return loader

class ImageFolder(data.Dataset):

    def __init__(self, root, transform=None, return_paths=False,
                 loader=default_loader):
        #imgs = sorted(make_dataset(root))
        # shuffle implicit pairs
        if "test" in root: 
            imgs = sorted(make_dataset(root))
        else:
            imgs = shuffle_pairs(sorted(make_dataset(root)))
        if len(imgs) == 0:
            raise(RuntimeError("Found 0 images in: " + root + "\n"
                               "Supported image extensions are: " +
                               ",".join(IMG_EXTENSIONS)))

        self.root = root
        self.imgs = imgs
        self.transform = transform
        self.return_paths = return_paths
        self.loader = loader

    def __getitem__(self, index):
        path = self.imgs[index]
        img = self.loader(path)
        if self.transform is not None:
            img.save('x_a1.png')
            img = self.transform(img)
            vutils.save_image(img, 'x_a2.png', nrow=1)
            #exit()
        if self.return_paths:
            return img, path
        else:
            return img

    def __len__(self):
        return len(self.imgs)

# RandomCrop modified
class MyRandomCrop(transforms.RandomCrop):
    def __init__(self, size, padding=0, pad_if_needed=False):
        super(MyRandomCrop, self).__init__(size, padding, pad_if_needed)
        self.counter = 0
        self.crop_indices = []
        
    def __call__(self, img):
        if self.padding > 0:
            img = TF.pad(img, self.padding)

        # pad the width if needed
        if self.pad_if_needed and img.size[0] < self.size[1]:
            img = TF.pad(img, (int((1 + self.size[1] - img.size[0]) / 2), 0))
        # pad the height if needed
        if self.pad_if_needed and img.size[1] < self.size[0]:
            img = TF.pad(img, (0, int((1 + self.size[0] - img.size[1]) / 2)))
        
        resample = self.counter % 2 == 0
        self.counter += 1
        if resample:
            self.crop_indices = self.get_params(img, self.size)
        i, j, h, w = self.crop_indices
        print('Using {} {} {} {}'.format(i, j, h, w))

        return TF.crop(img, i, j, h, w)

ptrblck · September 22, 2018, 6:56pm

What is your image size? The random crop size should be smaller than the image size.
Also the Resize transformation won’t do anything, if your crops are already in the same size.
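
One possible explanation for the constant `0 0 256 256` output, assuming new_size is set in the call: in the posted get_data_loader_folder, Resize((256, 256)) is prepended after MyRandomCrop, so it runs before the crop, and the crop then receives an image that is already 256x256. A crop of the same size as the image has only one valid placement. The sketch below uses a stand-in sampler in plain Python (`sample_crop` is a hypothetical name, not the torchvision function) to show the effect.

```python
import random


def sample_crop(img_h, img_w, crop_h, crop_w):
    """Pick a random top-left corner so that the crop fits inside the image.

    Stand-in written for illustration; it is not the torchvision function.
    """
    if crop_h > img_h or crop_w > img_w:
        raise ValueError("crop is larger than the image")
    i = random.randint(0, img_h - crop_h)  # randint(0, 0) can only return 0
    j = random.randint(0, img_w - crop_w)
    return i, j, crop_h, crop_w


if __name__ == "__main__":
    # Image already resized to the crop size: only one placement is possible.
    print({sample_crop(256, 256, 256, 256) for _ in range(100)})
    # prints {(0, 0, 256, 256)}

    # Larger image: the number of distinct placements is usually much higher.
    print(len({sample_crop(600, 800, 256, 256) for _ in range(100)}))
```

If this is the cause, resizing to something larger than the crop size, or dropping the Resize step, should give varying offsets again.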

mhusseinsh · September 22, 2018, 6:58pm

I am loading two datasets:
one has images of size (400, 300), the other (800, 600).

Concerning the resizing: yes, for sure it won’t do anything. I know this; it can be ignored.

ptrblck · September 22, 2018, 7:01pm

That’s strange. I’m testing it with images of size [3, 690, 334] and it seems to work:

Using 34 337 256 256
Using 34 337 256 256
Using 72 102 256 256
Using 72 102 256 256
Using 57 382 256 256
Using 57 382 256 256

mhusseinsh · September 22, 2018, 7:02pm

It is weird; I just copied what you wrote without editing anything.

ptrblck · September 22, 2018, 7:04pm

Could you run this code and see if the last line returns different values?

img = transforms.ToPILImage()(torch.randn(3, 600, 600))
crop = MyRandomCrop((256, 256))
crop.get_params(img, (256, 256))

If so, could you try this afterwards:

cropped = crop(img)

mhusseinsh · September 22, 2018, 7:10pm

I did that:

for i in range(10):
    img = transforms.ToPILImage()(torch.randn(3, 600, 600))
    crop = MyRandomCrop((256, 256))
    print(crop.get_params(img, (256, 256)))

and I got:

(228, 344, 256, 256)
(75, 317, 256, 256)
(190, 65, 256, 256)
(152, 243, 256, 256)
(74, 312, 256, 256)
(246, 237, 256, 256)
(137, 43, 256, 256)
(29, 186, 256, 256)
(320, 141, 256, 256)
(334, 326, 256, 256)