Re-normalizing images

Nikronic · November 4, 2019, 3:41pm

Hi,

Yes, we compute the mean and std channel-wise.

As for computing these values, I used the code linked below for the online method; I gathered it from the community and edited it. In my case the dataset was about 20GB and I could not load all of it into memory, so I needed to compute the mean and std batch-wise and then accumulate them over all batches in an epoch.
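To make the batch-wise accumulation idea concrete, here is a minimal sketch. It is not the linked `OnlineMeanStd` class (whose body is truncated in this post) but a different, commonly used technique: keep running per-channel sums and sums of squares, then derive the mean and std at the end. The function name `streaming_mean_std` is hypothetical, and the sketch assumes the loader yields image batches shaped `(N, C, H, W)`.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def streaming_mean_std(loader):
    """Per-channel mean and (population) std, accumulated one batch at a time.

    Assumption: each batch is a tensor shaped (N, C, H, W), or a list/tuple
    whose first element is such a tensor.
    """
    count = 0
    channel_sum = None
    channel_sq_sum = None
    for batch in loader:
        images = batch[0] if isinstance(batch, (list, tuple)) else batch
        # Accumulate in float64 to limit rounding error over many batches.
        images = images.to(torch.float64)
        if channel_sum is None:
            channels = images.shape[1]
            channel_sum = torch.zeros(channels, dtype=torch.float64)
            channel_sq_sum = torch.zeros(channels, dtype=torch.float64)
        # Reduce over batch, height and width; keep the channel dimension.
        channel_sum += images.sum(dim=(0, 2, 3))
        channel_sq_sum += (images ** 2).sum(dim=(0, 2, 3))
        count += images.shape[0] * images.shape[2] * images.shape[3]
    mean = channel_sum / count
    # Var[x] = E[x^2] - E[x]^2; clamp guards against tiny negative values.
    var = (channel_sq_sum / count - mean ** 2).clamp(min=0)
    return mean, var.sqrt()


# Usage with a small random stand-in dataset (illustrative only).
demo_data = torch.rand(10, 3, 4, 5)
demo_loader = DataLoader(TensorDataset(demo_data), batch_size=4)
demo_mean, demo_std = streaming_mean_std(demo_loader)
```

Only one batch is in memory at a time, so this works for datasets that do not fit in RAM. The std returned here is the population std (dividing by the pixel count, not the count minus one).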

github.com

Nikronic/CoarseNet/blob/master/utils/preprocess.py#L142-L200


class OnlineMeanStd:
    def __init__(self):
        pass


    def __call__(self, dataset, batch_size, method='strong'):
        """
        Calculate mean and std of a dataset in lazy mode (online)
        On mode strong, batch size will be discarded because we use batch_size=1 to minimize leaps.


        :param dataset: Dataset object corresponding to your dataset
        :param batch_size: higher size, more accurate approximation
        :param method: weak: fast but less accurate, strong: slow but very accurate - recommended = strong
        :return: A tuple of (mean, std) with size of (3,)
        """


        if method == 'weak':
            loader = DataLoader(dataset=dataset,
                                batch_size=batch_size,
                                shuffle=False,
                                num_workers=1,

This file has been truncated.

Note that this approach is only an approximation. Of the two implemented methods, strong gives the more accurate answer at the cost of a longer run time.
If you can load the entire dataset into memory, you do not need any approximation, and the approach would be different.
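For the in-memory case, a minimal sketch could look like the following. It assumes the whole dataset has already been stacked into a single tensor shaped `(N, C, H, W)`; the function name `in_memory_mean_std` is hypothetical and not part of the linked repository.

```python
import torch


def in_memory_mean_std(images):
    """Exact per-channel mean and population std for a tensor shaped (N, C, H, W)."""
    images = images.to(torch.float64)
    # Reduce over batch, height and width; one value per channel remains.
    mean = images.mean(dim=(0, 2, 3))
    std = images.std(dim=(0, 2, 3), unbiased=False)
    return mean, std


# Usage with a tiny synthetic tensor: 2 images, 3 channels, 2x2 pixels.
toy_images = torch.arange(24).reshape(2, 3, 2, 2)
toy_mean, toy_std = in_memory_mean_std(toy_images)
```

The resulting `mean` and `std` each have one entry per channel, so they can be passed to a normalization transform such as `torchvision.transforms.Normalize`.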

Ref: About Normalization using pre-trained vgg16 networks

Bests
Nik