How to use normalization values computed for three channels on a single channel?

Hi,

I have a function (extract_single) that extracts features from train images:

prnu = transforms.Compose([
    transforms.Resize((resize, resize)),
    transforms.Lambda(extract_single),
    transforms.ToTensor(),
    transforms.Normalize([0.4786, 0.4728, 0.4528], [0.2425, 0.2327, 0.2564]),
])

The result of transforms.Lambda(extract_single) is a NumPy array with one channel, while the input images have three channels.
I have already computed the normalization values for the three-channel images, but they cannot be applied directly to a single channel. Should I use one of the values from [0.4786, 0.4728, 0.4528], [0.2425, 0.2327, 0.2564], or do I need to calculate a separate mean and std for the single channel?

This might be a valid approach, but it would also be interesting to know what exactly extract_single does, as you might be able to reuse the same approach for the stats.
E.g. if extract_single slices the tensor into a single channel, you could use the corresponding mean and std values, too.
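For illustration, here is a minimal sketch of that idea, assuming (hypothetically) that the extracted feature were simply the first input channel, which may well not match what extract_single actually does:

import torchvision.transforms as transforms

# Hypothetical case: the single-channel output corresponds to the first RGB channel,
# so the first entries of the three-channel stats are reused.
prnu = transforms.Compose([
    transforms.Resize((resize, resize)),   # `resize` as defined in the original pipeline
    transforms.Lambda(extract_single),     # returns a 1-channel array
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.4786], std=[0.2425]),
])

If extract_single mixes the channels instead, a separate single-channel mean and std would be more appropriate.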


The extract_single function extracts the PRNU noise from images:

def extract_single(im: np.ndarray,
                   levels: int = 4,
                   sigma: float = 5,
                   wdft_sigma: float = 0) -> np.ndarray:
    """
    Extract noise residual from a single image
    :param im: grayscale or color image, np.uint8
    :param levels: number of wavelet decomposition levels
    :param sigma: estimated noise power
    :param wdft_sigma: estimated DFT noise power
    :return: noise residual
    """

    W = noise_extract(im, levels, sigma)          # wavelet-based noise residual
    W = rgb2gray(W)                               # this is where the output becomes single-channel
    W = zero_mean_total(W)                        # zero-mean the residual
    W_std = W.std(ddof=1) if wdft_sigma == 0 else wdft_sigma  # sample std unless wdft_sigma is given
    W = wiener_dft(W, W_std).astype(np.float32)   # Wiener filtering in the DFT domain

    return W
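
Since rgb2gray mixes all three channels and the Wiener filter changes the residual's statistics, I guess the three-channel stats no longer apply directly. Something like the sketch below could estimate a single mean and std from the extracted residuals themselves (residual_stats and train_image_paths are just placeholder names):

import numpy as np
from PIL import Image

def residual_stats(image_paths, resize=128):
    """Estimate the mean/std of extract_single outputs over the training images."""
    residuals = []
    for path in image_paths:
        im = Image.open(path).convert('RGB').resize((resize, resize))
        residuals.append(extract_single(np.asarray(im, dtype=np.uint8)))
    stacked = np.stack(residuals)  # shape (N, H, W), float32
    return float(stacked.mean()), float(stacked.std())

# gray_mean, gray_std = residual_stats(train_image_paths)
# ...then use transforms.Normalize(mean=[gray_mean], std=[gray_std]) after ToTensor()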

Alternatively, I want to do something like this (passing the normalized image to the extraction function, so the noise features are extracted from the normalized image):

my_transform = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.ToTensor(),
    transforms.Normalize([0.4786, 0.4728, 0.4528], [0.2425, 0.2327, 0.2564]),
    transforms.ToPILImage(),
    transforms.Lambda(noise_extract),
    transforms.ToTensor()
])

It works, but I think this approach adds processing overhead because ToTensor() is called twice and ToPILImage() has to be added. Is there a better way?
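
One idea would be to do the per-channel normalization directly on the NumPy array inside a single Lambda, so that ToTensor() is only called once and ToPILImage() is dropped entirely. This is just a sketch and assumes noise_extract accepts float input (if it requires np.uint8, the scaling would need adjusting):

import numpy as np
import torchvision.transforms as transforms

MEAN = np.array([0.4786, 0.4728, 0.4528], dtype=np.float32)
STD = np.array([0.2425, 0.2327, 0.2564], dtype=np.float32)

def normalize_then_extract(pil_img):
    # Same maths as ToTensor() + Normalize(), done in NumPy on the HWC array
    arr = np.asarray(pil_img, dtype=np.float32) / 255.0
    arr = (arr - MEAN) / STD
    return noise_extract(arr)  # assumes noise_extract can handle float input

my_transform = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.Lambda(normalize_then_extract),
    transforms.ToTensor(),  # single tensor conversion at the end
])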