MNIST iterator not working after normalization

After defining the transformation

            transform = transforms.Compose([
                        transforms.ToTensor(), #NOTE: Converts a PIL Image or numpy.ndarray (H x W x C) in the range [0, 255] to a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0]
                        transforms.Normalize((DataMean,), (DataStd,))]) 

where DataMean and DataStd are the mean and std computed over the training data,
I define the train dataset

            train_data = datasets.MNIST(root = './data', train = True, download = True, transform = transform)

and build the dataloaders (for my problem I need a separate dataloader for each class “i”)

            TrainDL[i] = torch.utils.data.DataLoader(train_data, batch_size = 100,
                                                     sampler = train_sampler, num_workers = 0)
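
For reference, a minimal sketch of how such a per-class train_sampler could be built (the loop below is hypothetical, not from the original code, and assumes a torchvision version where the dataset exposes a .targets attribute):

    import torch

    # Build one DataLoader per class, each restricted via a
    # SubsetRandomSampler to the indices belonging to that class.
    TrainDL = {}
    for i in range(10):
        class_indices = (train_data.targets == i).nonzero(as_tuple=True)[0]
        train_sampler = torch.utils.data.SubsetRandomSampler(class_indices)
        TrainDL[i] = torch.utils.data.DataLoader(
            train_data, batch_size=100, sampler=train_sampler, num_workers=0)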

Later in the code I want to iterate over the dataloaders, namely

        for EvalKey in TrainDL:
            for dataval,labelval in TrainDL[EvalKey]:

Here I get the following error

ValueError: Too many dimensions: 3>2

I read here that this is due to the Normalize transformation. It’s not clear to me why the third dimension should cause this problem (CIFAR10 also has 3 channels and works perfectly fine). Is it due to a difference between the dimensions __getitem__ expects and those you get after initialization? What is the easiest way to solve the problem?

MNIST images have a single channel only so normalizing these samples with 3 stats values will fail.
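
As a quick illustration (a minimal standalone sketch, not from the original post; note it raises a broadcasting error rather than necessarily the exact ValueError above), applying three-channel stats to a one-channel tensor fails right away:

    import torch
    from torchvision import transforms

    img = torch.rand(1, 28, 28)  # single-channel, MNIST-sized tensor
    norm = transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))  # 3 stats
    out = norm(img)  # RuntimeError: 3-value stats cannot broadcast over 1 channel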

Sorry, I meant 3 dimensions (I corrected my question), referring to your answer from the post in the link, i.e.:

img will have 3 dimensions ([1, 28, 28]), which will yield this error.

In my case, for the normalization I defined two functions to compute the mean and std directly on the MNIST dataset (DataMean):

        elif (self.DatasetName == 'MNIST'):
            # item[0] and item[1] are the image and its label
            imgs = [item[0] for item in self.train_data]
            imgs = torch.stack(imgs, dim=0).numpy()

            # calculate the mean over all pixels
            mean = imgs.mean()

            return mean

and std (DataStd):

        elif (self.DatasetName == 'MNIST'):
            # item[0] and item[1] are the image and its label
            imgs = [item[0] for item in self.train_data]
            imgs = torch.stack(imgs, dim=0).numpy()

            # calculate the std over all pixels
            std = imgs.std()

            return std
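
(As a side note, the two helpers could be merged into a single pass over the data; a minimal sketch, assuming self.train_data yields (tensor, label) pairs — the helper name mnist_stats is hypothetical:)

    import torch

    def mnist_stats(train_data):
        # stack all images once and compute both statistics over all pixels
        imgs = torch.stack([item[0] for item in train_data], dim=0)
        return imgs.mean().item(), imgs.std().item()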
            

What am I doing wrong? How could I fix it?

I don’t know, as your code is not executable, but this works for me:

from torch.utils.data import DataLoader
from torchvision import datasets, transforms

dataset = datasets.MNIST(root="~/python/data", download=False)

imgs = dataset.train_data

# calculate mean and std on the raw data
mean = imgs.float().mean()
std = imgs.float().std()

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=(mean,), std=(std,)),
])
dataset = datasets.MNIST(root="~/python/data", download=False, transform=transform)
loader = DataLoader(dataset, batch_size=2)

for data, target in loader:
    print(data.shape)
    print(target)

Thank you for the answer!

The first time you define the dataset (for the mean and std computation) shouldn’t you include

transform = transforms.ToTensor()

in order to compute the two quantities on the same scale ([0, 1]) that is used later for the “real” dataset?

Yes, it’s indeed missing; you could add the transformation, but you would then need to iterate the dataset.
Alternatively, you could also normalize the already calculated mean and std.
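
For example, the second option is a one-line rescale (a sketch, relying on the fact that ToTensor divides the raw uint8 pixel values by 255):

    # rescale the stats computed on the raw [0, 255] data to the [0, 1] range
    mean = imgs.float().mean() / 255.
    std = imgs.float().std() / 255.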

Why would I need to iterate the dataset in this case?

Thank you a lot for your answer; it made me realize that the problem in my case is probably not in the normalization step, but in the moment I define different dataloaders for different classes by splitting the dataset. I’m trying to understand where exactly the problem arises, as with CIFAR10 (and exactly the same splitting logic) I don’t get any error.

Because the internal .train_data won’t be directly transformed.
Each sample will be transformed on-the-fly in __getitem__ using the passed transformation, while the internal .train_data attribute will still hold the original data.
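
This can be checked directly (a small sketch using the transformed dataset defined above):

    # the internal attribute still holds the raw uint8 data ...
    print(dataset.train_data.shape)  # torch.Size([60000, 28, 28]), dtype uint8
    # ... while __getitem__ returns the transformed sample
    print(dataset[0][0].shape)       # torch.Size([1, 28, 28]), dtype float32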

I see! So for CIFAR10 this is not a problem, as the dimensions stay the same before and after the transformation, while for MNIST I get an extra channel dimension, right?

But why is this the case only when I split the dataset across multiple dataloaders and not when I keep a single dataloader for the whole dataset?

It’s still unclear what exactly is failing in your code, as my code snippets work, as you can see.
The MNIST dataset has a single channel and Normalize thus expects stats with a single value, too.

Thank you for the explanation; since I understood that the problem may not be in the normalization step, I created a more detailed post where I provide the whole pipeline (Code extention from CIFAR10 to MNIST not working).