Keras has image data generators for automatic rotations/transformations/distortions; is there an example of doing this in PyTorch?

Pretty much the question in the title.

Someone asked this a year ago, but I don't think they received a satisfying answer. None of the official PyTorch examples use clean code.

For example, the PyTorch DataLoader takes a batch_size parameter, but if someone writes their transformations in their dataset class so that __getitem__ returns several augmented versions of each sample, that parameter is no longer adhered to: the DataLoader would then generate a batch of size batch_size * however_many_transformations_are_applied_to_a_single_sample.

Certainly this must have been thought of, so can someone please point me in the direction of a tutorial or example that addresses this discrepancy?

thanks!

Could you point to the examples where you feel the code is not clean?

Usually random rotations, distortions, etc. are applied per sample, so that the batch size stays the same.
There are a few exceptions, e.g. FiveCrop, which returns five crops of a single image.
You can just apply the torchvision.transforms to a single image and return it.
I'm not sure what the Keras generator does differently, so could you explain your use case a bit, e.g. what kind of transformations you want to use and which ones give you problems?
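To make the per-sample convention concrete, here is a framework-free sketch; `random_hflip` is a toy stand-in for `transforms.RandomHorizontalFlip` (not the real torchvision implementation). The point is that a random transform takes one sample and returns exactly one sample, so the batch size is unaffected:

```python
import random

def random_hflip(img, p=0.5):
    """Flip a 2D image (a list of rows) left-right with probability p."""
    if random.random() < p:
        return [row[::-1] for row in img]
    return img

img = [[1, 2, 3],
       [4, 5, 6]]

# force the flip for the demo; one sample in, one sample out
out = random_hflip(img, p=1.0)  # -> [[3, 2, 1], [6, 5, 4]]
```

Whether the flip fires or not, the dataset still yields a single item per index, so the DataLoader's batch size is respected.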


I don’t think this is true.

Are you proposing that the dataset should always produce a single sample (x, y) for each call to __getitem__?

If so, how does one augment the dataset so that it incorporates random rotations/crops/shifts like here: https://keras.io/preprocessing/image/?

The only solution I can see is to randomly select a sample, then randomly select a transformation, and then produce a single sample.

I mentioned that I don’t think any of the examples are clean.

Also, if the random rotations/distortions/etc. are applied per sample, does that mean that the original sample could potentially never be used for training? In Keras, the augmentation produces additional samples. Is this not the case for PyTorch? In other words, is there any way to train on the original sample as well as whatever transformations I want to apply to the data? For context, I don't want to concatenate a transformed/modified dataset to the original dataset prior to training.

The usual approach is to just implement the code to load and process one single sample, yes.
That makes it quite easy to write your own code, as you don't have to take care of the batching.
The DataLoader will take care of it, even using multiprocessing (num_workers > 0).
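The division of labor described above can be sketched without any framework; `ToyDataset` and `toy_loader` here are hypothetical stand-ins for `torch.utils.data.Dataset` and `DataLoader`, just to show that `__getitem__` returns one sample and the loader alone decides the batch size:

```python
import random

class ToyDataset:
    """Toy stand-in for torch.utils.data.Dataset: __getitem__ yields ONE sample."""
    def __init__(self, data, transform=None):
        self.data = data
        self.transform = transform

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        x = self.data[idx]
        if self.transform:
            x = self.transform(x)  # random augmentation of this one sample
        return x  # always a single sample, however many transforms ran

def toy_loader(dataset, batch_size):
    """Toy stand-in for DataLoader: groups single samples into batches."""
    batch = []
    for i in range(len(dataset)):
        batch.append(dataset[i])
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # last, possibly smaller batch
        yield batch

ds = ToyDataset(list(range(10)), transform=lambda x: x + random.choice([0, 100]))
batches = list(toy_loader(ds, batch_size=4))
# batch sizes are [4, 4, 2] no matter what the transform does
```

Because augmentation happens inside `__getitem__` on a single sample, the requested batch size is always honored; the real DataLoader adds shuffling, collation, and worker processes on top of the same contract.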

If you want to apply multiple transformations on your data, you could just compose them:

from torchvision import transforms

data_transform = transforms.Compose([
        transforms.RandomResizedCrop(224),  # formerly RandomSizedCrop, now deprecated
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225])
    ])

dataset = MyDataset(image_paths, transforms=data_transform)

The transformations won't be randomly selected, but applied in the order in which you composed them.
If you want to pick a transformation at random, you can use transforms.RandomChoice.
Otherwise the transformations are applied in the order you pass them (or in which you apply them in your Dataset).
If you would like to rotate your images before flipping them (for whatever reason), just change the order of the transforms.
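The distinction between the two behaviors can be sketched with toy versions of the two classes; `ToyCompose` and `ToyRandomChoice` below are simplified stand-ins for `transforms.Compose` (apply ALL transforms, in order) and `transforms.RandomChoice` (apply exactly ONE, picked at random), not the real torchvision code:

```python
import random

class ToyCompose:
    """Applies every transform, in the order given."""
    def __init__(self, transforms):
        self.transforms = transforms
    def __call__(self, x):
        for t in self.transforms:
            x = t(x)
        return x

class ToyRandomChoice:
    """Applies exactly one transform, chosen at random."""
    def __init__(self, transforms):
        self.transforms = transforms
    def __call__(self, x):
        return random.choice(self.transforms)(x)

add_one = lambda x: x + 1
double = lambda x: x * 2

composed = ToyCompose([add_one, double])     # (x + 1) * 2
chosen = ToyRandomChoice([add_one, double])  # x + 1 OR x * 2

result_all = composed(3)  # -> 8
result_one = chosen(3)    # -> 4 or 6, picked at random
```

Swapping the order inside `ToyCompose([double, add_one])` would give `3 * 2 + 1 = 7` instead, which is exactly the "rotate before flipping" point above.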

I think you are also wrong on this point.

Generate batches of tensor image data with real-time data augmentation.

This does not sound as if the original samples are created before the augmented ones.

As I'm not that familiar with Keras, feel free to correct me, but using this code I cannot recover the original sample from the ImageDataGenerator:

import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
from keras.preprocessing.image import ImageDataGenerator

data_dir = './dummy_image/'
image = Image.open(data_dir + 'class0/dummy_image.jpg')
im_arr = np.array(image)

datagen = ImageDataGenerator(
        rescale=None,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True)

train_generator = datagen.flow_from_directory(
        data_dir,
        target_size=im_arr.shape[:-1],
        batch_size=1,
        class_mode='binary')

x_1, _ = train_generator.next()
f, axarr = plt.subplots(1, 2)
axarr[0].imshow(x_1[0].astype(np.uint8))
axarr[1].imshow(im_arr)
plt.show()

for idx, (x, y) in enumerate(train_generator):
    x = x.astype(np.uint8).squeeze()
    # cast to int to avoid uint8 wrap-around in the difference
    diff = np.abs(x.astype(int) - im_arr.astype(int))
    print('Iter {}, Abs error {}, x.min {}, x.max {}, im.min {}, im.max {}'.format(
        idx, np.mean(diff), x.min(), x.max(), im_arr.min(), im_arr.max()))
    if np.allclose(x, im_arr):
        break
    plt.imshow(diff)
    plt.show()

Note that I’ve created two folders (class0, class1) with the same single image inside both of them.
