Transforming dataset not working?

Hey all! I’m using the MNIST dataset available through torchvision and trying to use transform operations to create synthetic data.

In addition to a regular train_set where I only used transforms.ToTensor(), I wrote the following with the intention of appending it to the original train_set:

import torchvision
from torchvision import transforms

train_set2 = torchvision.datasets.MNIST(
    root='./data',
    train=True,
    download=True,
    transform=transforms.Compose([
        # random rotation, translation (up to 90% of the image size), scaling, and shear
        transforms.RandomAffine(degrees=20,
                                translate=(0.9, 0.9),
                                scale=(0.9, 1.1),
                                shear=(-20, 20)),
        transforms.ToTensor()
    ])
)
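
For the appending step, one option would be torch.utils.data.ConcatDataset (just a sketch, assuming train_set is the plain ToTensor() version):

from torch.utils.data import ConcatDataset, DataLoader

# stack the plain and augmented MNIST copies into one 120k-sample dataset
full_train = ConcatDataset([train_set, train_set2])
train_loader = DataLoader(full_train, batch_size=64, shuffle=True)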

However, when I view the images produced by extracting and transforming the dataset, there doesn't appear to be any difference in how they look at all.

For example, the following are my results:

plt.imshow(train_set.data[0])

plt.imshow(train_set2.data[0])

(both screenshots show the same, seemingly untransformed digit)

Any clarification would be greatly appreciated!

How could you pass a torch.Tensor to plt.imshow? Something smells here.

Well, when I convert it to a NumPy array it produces the same results… I'm more confused about why there isn't any transformation of the data occurring at all.

You are directly indexing the underlying, non-transformed data via train_set2.data.
Index the Dataset itself to get the transformed tensors: train_set2[index].
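
A minimal sketch of that, assuming matplotlib is imported as plt and train_set2 is built as above:

import matplotlib.pyplot as plt

# indexing the Dataset runs the transform pipeline; .data bypasses it
img, label = train_set2[0]   # img is the augmented 1x28x28 tensor
plt.imshow(img.squeeze(), cmap='gray')
plt.show()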
