Abnormal mean and standard deviation values for image dataset

I tried calculating the per-channel mean and standard deviation of an image dataset, but I am getting very high values. If I am not mistaken, the values should be between 0 and 1.

def mean_std(loader):
    mean = 0.0
    std = 0.0
    total_images_count = 0
    for images, _ in loader:
        image_count_in_a_batch = images.size(0)
        # flatten every image to (channels, height * width)
        images = images.view(image_count_in_a_batch, images.size(1), -1)
        # accumulate the per-image, per-channel mean and std over the batch
        mean += images.float().mean(2).sum(0)
        std += images.float().std(2).sum(0)
        total_images_count += image_count_in_a_batch
    # average over all images in the dataset
    mean /= total_images_count
    std /= total_images_count
    return mean, std

(tensor([112.1058, 126.2224,  79.7943]), tensor([49.5487, 50.0356, 43.0908]))

I can’t seem to find the mistake in this. Any help would be appreciated!

If your inputs are normalized to the [0, 1] range, then the mean and std would be expected to fall in that range, too.
However, based on your values, I would guess that your inputs are raw pixel values in [0, 255].
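
For example, a minimal sketch (assuming a torchvision ImageFolder dataset and the mean_std function above; the path is a placeholder): transforms.ToTensor() scales raw uint8 pixels from [0, 255] down to [0, 1], so the computed statistics land in that range as well.

from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# ToTensor() converts uint8 images in [0, 255] to float tensors in [0, 1]
dataset = datasets.ImageFolder("path/to/train", transform=transforms.ToTensor())
loader = DataLoader(dataset, batch_size=64, num_workers=2)

mean, std = mean_std(loader)
print(mean, std)  # per-channel values, now between 0 and 1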

Since I am using the Albumentations library, I had to specify max_pixel_value in the Normalize transform. It's working fine now. Thank you!
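
For reference, a sketch of what that can look like (not the poster's exact code; the mean/std values are just the numbers above divided by 255): A.Normalize subtracts mean * max_pixel_value and divides by std * max_pixel_value, so the statistics are given in the [0, 1] scale while max_pixel_value describes the raw input range.

import albumentations as A

transform = A.Compose([
    A.Normalize(
        mean=(0.440, 0.495, 0.313),   # roughly the values above / 255
        std=(0.194, 0.196, 0.169),
        max_pixel_value=255.0,        # raw uint8 input range
    ),
])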

I have a question here.
When computing the mean and standard deviation of my dataset, should I use all the data, or only the training split?

The ideal protocol is to use only the training data to calculate the mean and std.

And then use this to normalize all the splits?

Yes, you are right. Use the mean and std from the training set to normalize the validation, test, and dev sets, if any.
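
A sketch of that workflow, assuming torchvision transforms (the stats and paths are placeholders): compute the statistics on the training set once, then reuse the same Normalize for every split.

from torchvision import datasets, transforms

# placeholder values: whatever mean_std() returned on the training set
train_mean, train_std = (0.440, 0.495, 0.313), (0.194, 0.196, 0.169)
normalize = transforms.Normalize(mean=train_mean, std=train_std)

train_tf = transforms.Compose([transforms.ToTensor(), normalize])
val_tf = transforms.Compose([transforms.ToTensor(), normalize])  # same stats as training

train_set = datasets.ImageFolder("path/to/train", transform=train_tf)
val_set = datasets.ImageFolder("path/to/val", transform=val_tf)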

Thank you for your explanation.