MNIST iterator not working after normalization

After defining the transformation

            transform = transforms.Compose([
                        transforms.ToTensor(), #NOTE: Converts a PIL Image or numpy.ndarray (H x W x C) in the range [0, 255] to a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0]
                        transforms.Normalize((DataMean,), (DataStd,))]) 

where DataMean and DataStd are the mean and std computed over the training data,
I define the train dataset

            train_data = datasets.MNIST(root = './data', train = True, download = True, transform = transform)

and build the dataloaders (for my problem I need a separate dataloader for each class “i”)

            TrainDL[i] = torch.utils.data.DataLoader(train_data, batch_size = 100,
                                                     sampler = train_sampler, num_workers = 0)
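
For reference, a minimal sketch of how such a per-class train_sampler could be built (the loop below is hypothetical, not from the original code, and assumes a torchvision version where the dataset exposes a .targets attribute):

    import torch

    # Build one DataLoader per class, each restricted via a
    # SubsetRandomSampler to the indices belonging to that class.
    TrainDL = {}
    for i in range(10):
        class_indices = (train_data.targets == i).nonzero(as_tuple=True)[0]
        train_sampler = torch.utils.data.SubsetRandomSampler(class_indices)
        TrainDL[i] = torch.utils.data.DataLoader(
            train_data, batch_size=100, sampler=train_sampler, num_workers=0)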

Later in the code I want to iterate over the dataloaders, namely

        for EvalKey in TrainDL:
            for dataval,labelval in TrainDL[EvalKey]:

Here I get the following error

ValueError: Too many dimensions: 3>2

I read here that this is due to the Normalize transformation. It’s not clear to me why the third dimension should cause this problem (CIFAR10 also has 3 channels and works perfectly fine). Is it due to a difference between the dimensions __getitem__ expects and those you get after initialization? What is the easiest way to solve the problem?

MNIST images have a single channel only so normalizing these samples with 3 stats values will fail.
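
As a quick illustration (a minimal standalone sketch, not from the original post; note it raises a broadcasting error rather than necessarily the exact ValueError above), applying three-channel stats to a one-channel tensor fails right away:

    import torch
    from torchvision import transforms

    img = torch.rand(1, 28, 28)  # single-channel, MNIST-sized tensor
    norm = transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))  # 3 stats
    out = norm(img)  # RuntimeError: 3-value stats cannot broadcast over 1 channel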

Sorry, I meant 3 dimensions (I corrected my question), referring to your answer from the post in the link, i.e.:

img will have 3 dimensions ([1, 28, 28]), which will yield this error.

In my case, for the normalization I defined two functions to compute the mean and std directly on the MNIST dataset (DataMean):

        elif (self.DatasetName == 'MNIST'):
            # item[0] and item[1] are the image and its label
            imgs = [item[0] for item in self.train_data]
            imgs = torch.stack(imgs, dim=0).numpy()

            # calculate the mean over all pixels
            mean = imgs.mean()

            return mean

and std (DataStd):

        elif (self.DatasetName == 'MNIST'):
            # item[0] and item[1] are the image and its label
            imgs = [item[0] for item in self.train_data]
            imgs = torch.stack(imgs, dim=0).numpy()

            # calculate the std over all pixels
            std = imgs.std()

            return std
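
(As a side note, the two helpers could be merged into a single pass over the data; a minimal sketch, assuming self.train_data yields (tensor, label) pairs — the helper name mnist_stats is hypothetical:)

    import torch

    def mnist_stats(train_data):
        # stack all images once and compute both statistics over all pixels
        imgs = torch.stack([item[0] for item in train_data], dim=0)
        return imgs.mean().item(), imgs.std().item()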
            

What am I doing wrong? How could I fix it?

I don’t know, as your code is not executable, but this works for me:

from torch.utils.data import DataLoader
from torchvision import datasets, transforms

dataset = datasets.MNIST(root="~/python/data", download=False)

imgs = dataset.train_data

# calculate mean and std on the raw data
mean = imgs.float().mean()
std = imgs.float().std()

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=(mean,), std=(std,)),
])
dataset = datasets.MNIST(root="~/python/data", download=False, transform=transform)
loader = DataLoader(dataset, batch_size=2)

for data, target in loader:
    print(data.shape)
    print(target)

Thank you for the answer!

The first time you define the dataset (for the mean and std computation) shouldn’t you include

transform = transforms.ToTensor()

in order to compute the two quantities on the same scale ([0, 1]) that is used later for the “real” dataset?

Yes, it’s indeed missing; you could add the transformation, but you would then need to iterate the dataset.
Alternatively, you could also normalize the already calculated mean and std.
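
For example, the second option is a one-line rescale (a sketch, relying on the fact that ToTensor divides the raw uint8 pixel values by 255):

    # rescale the stats computed on the raw [0, 255] data to the [0, 1] range
    mean = imgs.float().mean() / 255.
    std = imgs.float().std() / 255.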

Why would I need to iterate the dataset in this case?

Thank you a lot for your answer; it made me realize that the problem in my case is probably not in the normalization step, but in the moment I define different dataloaders for different classes by splitting the dataset. I’m trying to understand where exactly the problem arises, as with CIFAR10 (and exactly the same splitting logic) I don’t get any error.

Because the internal .train_data won’t be directly transformed.
Each sample will be transformed on-the-fly in __getitem__ using the passed transformation, while the internal .train_data attribute will still hold the original data.
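
This can be checked directly (a small sketch using the transformed dataset defined above):

    # the internal attribute still holds the raw uint8 data ...
    print(dataset.train_data.shape)  # torch.Size([60000, 28, 28]), dtype uint8
    # ... while __getitem__ returns the transformed sample
    print(dataset[0][0].shape)       # torch.Size([1, 28, 28]), dtype float32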

I see! So for CIFAR10 this is not a problem, as the dimensions stay the same before and after the transformation, while for MNIST I get an extra channel dimension, right?

But why is this the case only when I split the dataset across multiple dataloaders and not when I keep a single dataloader for the whole dataset?

It’s still unclear what exactly is failing in your code, as my code snippets work, as you can see.
The MNIST dataset has a single channel and Normalize thus expects stats with a single value, too.

Thank you for the explanation; since I understood that the problem may not be in the normalization step, I created a more detailed post where I provide the whole pipeline (Code extention from CIFAR10 to MNIST not working).