I get a tensor of 600 values instead of 3 values for mean and std

I am trying to normalize my image data, and for that I need to find the mean and std for train_loader.

mean = 0.0
std = 0.0
nb_samples = 0.0
for data in train_loader:
    images, landmarks = data["image"], data["landmarks"]
    batch_samples = images.size(0)
    images_data = images.view(batch_samples, images.size(1), -1)
    mean += images_data.float().mean(2).sum(0)
    std += images_data.float().std(2).sum(0)
    # mean += images_data.mean(2).sum(0)
    # std += images_data.std(2).sum(0)
    nb_samples += batch_samples

mean /= nb_samples
std /= nb_samples

Here, mean and std each have shape torch.Size([600]).

When I ran (almost) the same code on dataloader, it worked as expected:

# code from https://discuss.pytorch.org/t/about-normalization-using-pre-trained-vgg16-networks/23560/6?u=mona_jalal
mean = 0.0
std = 0.0
nb_samples = 0.0
for data in dataloader:
    images, landmarks = data["image"], data["landmarks"]
    batch_samples = images.size(0)

    images_data = images.view(batch_samples, images.size(1), -1)
    mean += images_data.mean(2).sum(0)
    std += images_data.std(2).sum(0)
    nb_samples += batch_samples

mean /= nb_samples
std /= nb_samples

and I got:
mean is: tensor([0.4192, 0.4195, 0.4195], dtype=torch.float64), std is: tensor([0.1182, 0.1184, 0.1186], dtype=torch.float64)

So my dataloader is:

class MothLandmarksDataset(Dataset):
    """Face Landmarks dataset."""

    def __init__(self, csv_file, root_dir, transform=None):
        """
        Args:
            csv_file (string): Path to the csv file with annotations.
            root_dir (string): Directory with all the images.
            transform (callable, optional): Optional transform to be applied
                on a sample.
        """
        self.landmarks_frame = pd.read_csv(csv_file)
        self.root_dir = root_dir
        self.transform = transform

    def __len__(self):
        return len(self.landmarks_frame)

    def __getitem__(self, idx):
        if torch.is_tensor(idx):
            idx = idx.tolist()

        img_name = os.path.join(self.root_dir, self.landmarks_frame.iloc[idx, 0])
        image = io.imread(img_name)
        landmarks = self.landmarks_frame.iloc[idx, 1:]
        landmarks = np.array([landmarks])
        landmarks = landmarks.astype('float').reshape(-1, 2)
        sample = {'image': image, 'landmarks': landmarks}

        if self.transform:
            sample = self.transform(sample)

        return sample

transformed_dataset = MothLandmarksDataset(
    csv_file='moth_gt.csv',
    root_dir='.',
    transform=transforms.Compose([
        Rescale(256),
        RandomCrop(224),
        ToTensor(),
    ]),
)

dataloader = DataLoader(transformed_dataset, batch_size=3,
                        shuffle=True, num_workers=4)

and train_loader is:

# Device configuration
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
seed = 42
np.random.seed(seed)
torch.manual_seed(seed)

# split the dataset into training and validation sets
len_valid_set = int(0.1 * len(dataset))
len_train_set = len(dataset) - len_valid_set

print("The length of Train set is {}".format(len_train_set))
print("The length of Validation set is {}".format(len_valid_set))

train_dataset, valid_dataset = torch.utils.data.random_split(dataset, [len_train_set, len_valid_set])

# shuffle and batch the datasets
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=8, shuffle=True, num_workers=4)
test_loader = torch.utils.data.DataLoader(valid_dataset, batch_size=8, shuffle=True, num_workers=4)

Please let me know if more information is needed.

I basically need 3 per-channel values for the mean of train_loader and 3 per-channel values for the std of train_loader to pass as arguments to Normalize.
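
For context, Normalize takes one mean and one std per channel and applies (x - mean) / std channel by channel. A minimal sketch of that arithmetic in plain PyTorch, assuming a channels-first float image of shape [3, H, W]. The helper name normalize_image is hypothetical, the image is random, and the statistics are the ones printed above for dataloader, used purely as an illustration:

```python
import torch

def normalize_image(image, mean, std):
    """Per-channel (x - mean) / std for a channels-first image [C, H, W].

    Hypothetical helper that mirrors the arithmetic of a Normalize transform.
    """
    mean = torch.as_tensor(mean, dtype=image.dtype).view(-1, 1, 1)
    std = torch.as_tensor(std, dtype=image.dtype).view(-1, 1, 1)
    return (image - mean) / std

# Statistics reported above for dataloader (illustration only).
mean = [0.4192, 0.4195, 0.4195]
std = [0.1182, 0.1184, 0.1186]

image = torch.rand(3, 224, 224)  # random stand-in for one image
normalized = normalize_image(image, mean, std)
print(normalized.shape)
```

This only works when mean and std have exactly one entry per channel, which is why 600 values cannot be used.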

Inside the loop, images_data for dataloader has shape torch.Size([3, 3, 50176]), and images_data for train_loader has shape torch.Size([8, 600, 2400]).

The 8 in torch.Size([8, 600, 2400]) for train_loader is the batch_size I set for the training set.
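
These shapes hint at the layout: 2400 = 800 * 3, so the train_loader batches look like channels-last images of shape [8, 600, 800, 3], and images.size(1) is then the height (600) rather than the channel count. A minimal sketch of one possible fix, assuming that layout (it is inferred from the shapes, not confirmed) and using random data in place of the real images: move the channel axis to position 1 before flattening.

```python
import torch

# Random stand-in for one train_loader batch. The channels-last layout
# [batch, height, width, channels] is an assumption inferred from the shapes.
images = torch.rand(8, 600, 800, 3)

# Move channels to dim 1 so the layout is [batch, channels, height, width].
images = images.permute(0, 3, 1, 2)

batch_samples = images.size(0)
# permute returns a non-contiguous tensor, so use reshape rather than view.
images_data = images.reshape(batch_samples, images.size(1), -1)

mean = images_data.float().mean(2).sum(0) / batch_samples
std = images_data.float().std(2).sum(0) / batch_samples

print(images_data.shape)      # torch.Size([8, 3, 480000])
print(mean.shape, std.shape)  # torch.Size([3]) torch.Size([3])
```

If dataset was created without the ToTensor transform that transformed_dataset uses, that may be where the two layouts diverge; the snippet for train_loader does not show how dataset is built.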

Most probably this is a very dumb answer but seems to do the job. Could someone please verify if this answer makes sense?

I also tried a slightly less dumb method (the values are slightly different). Does this approach sound reasonable?

Hi, I’m sure this can be done in PyTorch as well (without using NumPy).
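
A rough sketch of what a PyTorch-only version could look like, assuming batches are dicts with an "image" entry as in the loaders in this thread. The function name per_channel_stats and its channels_last flag are hypothetical, and a plain list of random batches stands in for a DataLoader:

```python
import torch

def per_channel_stats(loader, channels_last=False):
    """Average the per-image channel mean and std over all batches.

    Hypothetical helper; it follows the same averaging scheme as the loops
    in this thread (mean of per-image statistics).
    """
    mean = 0.0
    std = 0.0
    nb_samples = 0
    for data in loader:
        images = data["image"].float()
        if channels_last:
            # [N, H, W, C] -> [N, C, H, W]
            images = images.permute(0, 3, 1, 2)
        batch_samples = images.size(0)
        images_data = images.reshape(batch_samples, images.size(1), -1)
        mean = mean + images_data.mean(2).sum(0)
        std = std + images_data.std(2).sum(0)
        nb_samples += batch_samples
    return mean / nb_samples, std / nb_samples

# Stand-in for a DataLoader: two random channels-last batches.
fake_loader = [{"image": torch.rand(8, 60, 80, 3)} for _ in range(2)]
mean, std = per_channel_stats(fake_loader, channels_last=True)
print(mean.shape, std.shape)
```

Note that this averages the per-image std, which is not the same as the std over all pixels of the dataset; it simply matches what the loops above compute.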

Can you please check whether the shape of images_data is the same in both code snippets, for train_loader and dataloader, i.e.,

mean = 0.0
std = 0.0
nb_samples = 0.0
for data in train_loader:
    images, landmarks = data["image"], data["landmarks"]
    batch_samples = images.size(0)
    images_data = images.view(batch_samples, images.size(1), -1)  #HERE!
    mean += images_data.float().mean(2).sum(0)
    std += images_data.float().std(2).sum(0)
    # mean += images_data.mean(2).sum(0)
    # std += images_data.std(2).sum(0)
    nb_samples += batch_samples

mean /= nb_samples
std /= nb_samples

and

mean = 0.0
std = 0.0
nb_samples = 0.0
for data in dataloader:
    images, landmarks = data["image"], data["landmarks"]
    batch_samples = images.size(0)

    images_data = images.view(batch_samples, images.size(1), -1)  #HERE!
    mean += images_data.mean(2).sum(0)
    std += images_data.std(2).sum(0)
    nb_samples += batch_samples

mean /= nb_samples
std /= nb_samples

No, it is not. dataloader covers the entire dataset and train_loader covers only the training set. My focus is on train_loader.

When I do that, I get a tensor of 600 values for mean and the same for std. I should have only 3 values for each.

Yes, but do both shapes look the same? That is, are both in a format such as [n_images, height, width, channels], where only the value of n_images differs between dataloader and train_loader?

Can you please tell us the shapes of images_data in dataloader and train_loader? With those, we can reproduce your error by generating random data of the same shapes. Thank you.
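
With the shapes of images_data reported earlier in the thread, the symptom can be reproduced on random data. A minimal sketch, with torch.rand standing in for the real images and the dictionary shapes being a hypothetical name used only to collect the results:

```python
import torch

# Shapes of images_data reported for dataloader and train_loader.
dataloader_like = torch.rand(3, 3, 50176)
train_loader_like = torch.rand(8, 600, 2400)

shapes = {}
for name, images_data in [("dataloader", dataloader_like),
                          ("train_loader", train_loader_like)]:
    # Same reduction as in the loops: stats over dim 2, summed over the batch.
    mean = images_data.mean(2).sum(0)
    std = images_data.std(2).sum(0)
    shapes[name] = tuple(mean.shape)
    print(name, mean.shape, std.shape)
```

The result always has as many entries as dim 1 of images_data, so 3 for the first shape and 600 for the second.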