I have a small image dataset as numpy arrays and want to apply data augmentation to it, but there doesn’t seem to be a way to use numpy arrays with torchvision?
torch.utils.data doesn’t have a transform parameter, and torchvision.datasets doesn’t include a numpy dataset.
I don’t have the dataset on disk in the form I need (it gets composed from multiple datasets depending on the problem I want to solve, e.g. I need this class from this dataset, that class from another, etc.).
If you have numpy arrays, you can convert them to PIL Image format and then apply the data augmentation techniques in torchvision.transforms. The conversion works as follows:
- If the array has type uint8:
from PIL import Image
im = Image.fromarray(np_arr)
- If the array has type float (with values in [0, 1]):
from PIL import Image
img = Image.fromarray((np_arr*255).astype('uint8'))
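Putting both cases together, a minimal sketch with synthetic arrays (the array contents here are made up for illustration):

```python
import numpy as np
from PIL import Image

# uint8 case: values are already in 0..255, pass the array directly
arr_u8 = np.arange(784, dtype=np.uint8).reshape(28, 28)
im = Image.fromarray(arr_u8)

# float case: assuming values lie in [0, 1], scale to 0..255 first
arr_f = np.linspace(0, 1, 784).reshape(28, 28)
img = Image.fromarray((arr_f * 255).astype('uint8'))

print(im.size, img.mode)  # (28, 28) L
```

Both calls produce a single-channel ('L' mode) grayscale image, which is what the torchvision transforms expect for 2-D arrays.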
Maybe I’ve misunderstood something… but then I’ve just transformed my training data, not augmented it? I could save 10 transformed versions per data point, but at some point that gets expensive.
(I don’t have that many data points and need intensive data augmentation, which worked pretty well in Keras.)
I thought torchvision.datasets applied the transforms once per epoch (or per batch)?
Yes, with torchvision.transforms you can apply different random data augmentations like RandomHorizontalFlip, RandomResizedCrop, etc., and you can combine a list of such transformations using torchvision.transforms.Compose(transforms).
But note that all these transformations expect the data in PIL Image format, so you first convert your numpy array to a PIL Image and then apply the transformations. At the end, torchvision.transforms.ToTensor() converts the final image to a Tensor.
This is not working: Image.fromarray needs a single image, not the whole dataset. Also, if I follow your code, the transformation is the same for every epoch. It’s also not how it’s used in the tutorial (https://pytorch.org/tutorials/beginner/data_loading_tutorial.html, compare the FaceLandmarksDataset and its __getitem__ method).
So, is there a generic numpy dataset? It seems trivial to build, but I can’t really believe it’s not included in torchvision by default.
I assumed np_arr was just one image, not the entire dataset. You would need to iterate through np_arr to get each image, convert it with Image.fromarray(), and then apply the transformations to that image.
Every time __getitem__ is called, a random transformation will be performed on the image.
Here is some pseudo-code (assume data_arr is the entire dataset, with shape (1000, 784), corresponding to 1000 images of size 28x28):
import torch
from PIL import Image
from torch.utils import data
from torchvision import transforms

class CelebA(data.Dataset):
    def __init__(self):
        self.data_arr = ...  # define the data array (load from file)
        self.labels = ...    # define the labels
        self.transform = transforms.Compose([
            transforms.RandomCrop(20),
            transforms.RandomHorizontalFlip(),
            transforms.ToTensor()])

    def __len__(self):
        return len(self.data_arr)

    def __getitem__(self, index):
        np_arr = self.data_arr[index, :]
        y = self.labels[index]
        # reshape np_arr to 28x28
        np_arr = np_arr.reshape(28, 28)
        # convert to a PIL Image
        img = Image.fromarray((np_arr * 255).astype('uint8'))
        # apply the transformations and return tensors
        return self.transform(img), torch.tensor(y, dtype=torch.float)
Note that this code is not complete; I just wanted to show the general idea of how to build a Dataset class from a numpy array.
Ah, I misunderstood your example! Yeah, I need the dataset class.
No problem! Please try the Dataset and let me know if there is any other issue.