I think in the other post by @ptrblck, he is computing the mean and std over the pixels not over samples in the batch. So, then that code in About Normalization using pre-trained vgg16 networks is correct, since the goal is to compute the mean and std for each batch and then take the average of these two quantities over the entire dataset.
3 Likes