Image data augmentation for numpy data

I have a small image dataset as numpy arrays and want to apply data augmentation to it, but there doesn’t seem to be a way to use them with torchvision?
torch.utils.data doesn’t have a transform parameter and torchvision.datasets doesn’t have a numpy dataset.
I don’t have the dataset on the drive in the form I need (it gets composed out of multiple datasets depending on the problem I want to solve, e.g. I need this class from this dataset, that class from another, etc.).

If you have numpy arrays, you can convert them to PIL Image format and then apply the data augmentation techniques from torchvision.transforms. The conversion is as follows:

  • If the array has type uint8:
from PIL import Image
img = Image.fromarray(np_arr)
  • If the array has type float (values in [0, 1]):
from PIL import Image
img = Image.fromarray((np_arr * 255).astype('uint8'))
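The float case above can be sketched end-to-end like this (the array here is hypothetical random data, just to show the round trip):

```python
import numpy as np
from PIL import Image

# Hypothetical example: one 28x28 grayscale image as floats in [0, 1]
np_arr = np.random.rand(28, 28).astype('float32')

# Scale to [0, 255] and cast to uint8 before handing it to PIL
img = Image.fromarray((np_arr * 255).astype('uint8'))

print(img.size)   # (28, 28)
print(img.mode)   # 'L' = 8-bit grayscale
```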

Maybe I’ve misunderstood something… but then I’ve just transformed my training data, not augmented it? I could save 10 transformed versions per datapoint, but at some point that gets expensive.

(I don’t have that many datapoints and need intensive data augmentation, which worked pretty well in Keras.)

I thought torchvision.datasets applies them per epoch (or batch)?

Yes, with torchvision.transforms you can apply different random data augmentations like RandomHorizontalFlip, RandomResizedCrop, …, and you can also combine a list of such transformations using torchvision.transforms.Compose(transforms).

https://pytorch.org/docs/stable/torchvision/transforms.html?highlight=randomhorizontalflip#torchvision.transforms.RandomHorizontalFlip

But note that all these transformations need the data in PIL Image format. So you can first convert your numpy array to a PIL Image and then apply the transformations. At the end, torchvision.transforms.ToTensor() will convert the final image to a Tensor.

This is not working. Image.fromarray needs a single image, not the whole dataset. Also, if I follow your code, the transformation is the same for every epoch. It’s also not how it’s used in the tutorial (https://pytorch.org/tutorials/beginner/data_loading_tutorial.html, compare the FaceLandmarksDataset and its __getitem__ method).

So, is there a generic numpy dataset? It seems trivial to build, but I can’t really believe it isn’t included in torchvision by default.

I assumed np_arr is just one image, not the entire dataset. In that case, you would need to iterate through the array to get each image, convert it to a PIL Image with Image.fromarray(), and then apply the transformations to that image.

Every time __getitem__ is called, a random transformation will be performed on the image.

Here is some pseudo-code (assume data_arr is the entire dataset with shape (1000, 784), corresponding to 1000 images of size 28x28):

import torch
from torch.utils import data
from torchvision import transforms
from PIL import Image


class CelebA(data.Dataset):

    def __init__(self):
        self.data_arr = ... # define the data array (load from file)
        self.labels = ... # define the labels

        self.transform = transforms.Compose([
            transforms.RandomCrop(20),
            transforms.RandomHorizontalFlip(),
            transforms.ToTensor()])

    def __len__(self):
        return len(self.data_arr)

    def __getitem__(self, index):
        np_arr = self.data_arr[index, :]
        y = self.labels[index]

        # reshape np_arr to 28x28
        np_arr = np_arr.reshape(28, 28)

        # convert to a PIL Image
        img = Image.fromarray((np_arr * 255).astype('uint8'))

        # apply the transformations and return tensors
        return self.transform(img), torch.tensor(y, dtype=torch.float)


Note that this code is not complete; I just wanted to show the general idea of how to build a Dataset class from a numpy array.

Ah, I misunderstood your example! Yeah, I need the dataset class.

No problem! Please try the Dataset and let me know if there is any other issue.