# Image normalization for using a pretrained model

Hello all, I am using the ResNet-50 pretrained model from torchvision. Before using the pretrained model, I prepared my input data as below for training from scratch:

```python
input = torch.from_numpy(image.transpose((2, 0, 1))).float().div(255)
```

To use the pretrained model, I have to follow the normalization method that PyTorch uses. Specifically, my code is:

```python
input = torch.from_numpy(image).double()
mean = np.array([0.485, 0.456, 0.406])
mean = torch.from_numpy(mean)
std = np.array([0.229, 0.224, 0.225])
std = torch.from_numpy(std)
input = std * input + mean
input = np.clip(input, 0, 1)
```

Is the above normalization correct for the input of the ResNet-50 pretrained model? I found that it takes a lot of time in comparison with my first way (dividing by 255).

You should subtract the mean and divide by the std.
Have a look at the formula in `torchvision.transforms.Normalize`.
Could you try that and report if it helps training?
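To make the fix concrete, here is a minimal NumPy sketch of `(input - mean) / std` applied channel-wise, after first scaling the image to [0, 1]. The image values here are a made-up example; the same arithmetic applies to a tensor:

```python
import numpy as np

# Hypothetical 4 x 4 RGB image, all pixels 128 (an assumption for illustration)
image = np.full((4, 4, 3), 128, dtype=np.uint8)

mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])

# Scale to [0, 1] first, then subtract the mean and divide by the std
x = image.astype(np.float64) / 255.0
x = (x - mean) / std  # broadcasts over the last (channel) axis
```

Note that this is the reverse of the snippet in the question, which multiplied by the std and added the mean (that would *de*-normalize an already-normalized tensor).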

For normalization I would do as @ptrblck suggests.
If you just need to scale your values to between 0 and 1, for a black-and-white image for example, dividing by 255 is enough, I guess.

Thanks so much for your reply. @ptrblck: how can I use it in my case? My input is a numpy array, as in the first way.

If you have a `PIL.Image`, you could do the following:

```python
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

transform = transforms.Compose([
    transforms.ToTensor(),
    normalize,
])
```

In that way, you don’t have to normalize the data yourself.
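For intuition, the pipeline is roughly equivalent to the following NumPy computation: `ToTensor` scales a `uint8` H x W x C image to a float C x H x W array in [0, 1], and `Normalize` subtracts the mean and divides by the std per channel. This is an illustrative sketch of the semantics, not torchvision's actual implementation:

```python
import numpy as np

def to_tensor_and_normalize(image, mean, std):
    """Mimic transforms.ToTensor() followed by transforms.Normalize()."""
    x = image.astype(np.float64) / 255.0      # ToTensor: [0, 255] -> [0.0, 1.0]
    x = x.transpose(2, 0, 1)                  # HWC -> CHW
    mean = np.asarray(mean).reshape(3, 1, 1)  # reshape for per-channel broadcast
    std = np.asarray(std).reshape(3, 1, 1)
    return (x - mean) / std                   # Normalize

img = np.zeros((2, 2, 3), dtype=np.uint8)     # hypothetical all-black image
out = to_tensor_and_normalize(img, [0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
print(out.shape)  # (3, 2, 2)
```

Note the channel-axis move: after `ToTensor` the channel dimension comes first, so the mean and std must broadcast along axis 0.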

How do I use the method? Suppose I have a tensor `x`: is it `x.transforms.Normalize(mean, std)`, `transforms.Normalize(x, mean, std)`, or some other way?

Hey,

The simplest way to use `Normalize` is how @ptrblck suggested:

```python
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

transform = transforms.Compose([
    transforms.ToTensor(),
    normalize,
])
```

but then you pass the transform to your custom `Dataset` as a parameter, and after loading the image you just apply the transform to it. Here is a simple example:

```python
class CustomDataset(data.Dataset):
    def __init__(self, transforms=None):
        self.transforms = transforms

    def get_image(self, index):
        # implement this in your subclass
        raise NotImplementedError

    def __getitem__(self, index):
        image, target = self.get_image(index)

        # perform transform
        if self.transforms is not None:
            image = self.transforms(image)

        return image, target

    def __len__(self):
        return len(self.dataset)
```
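To show the pattern end to end, here is a minimal concrete version. It is plain Python so it runs without torch (with torchvision you would subclass `data.Dataset` and pass a `Compose` pipeline as the transform); the data and the toy transform are assumptions for illustration:

```python
import numpy as np

class ToyDataset:
    """Minimal stand-in for the Dataset-with-transform pattern (hypothetical data)."""
    def __init__(self, images, targets, transforms=None):
        self.images = images
        self.targets = targets
        self.transforms = transforms

    def __getitem__(self, index):
        image = self.images[index]
        if self.transforms is not None:
            image = self.transforms(image)
        return image, self.targets[index]

    def __len__(self):
        return len(self.images)

# A toy transform: scale a uint8 image to [0, 1]
scale = lambda img: img.astype(np.float64) / 255.0

ds = ToyDataset([np.full((2, 2, 3), 255, dtype=np.uint8)], [0], transforms=scale)
img, target = ds[0]
print(img.max(), target)  # 1.0 0
```

The key point is that the transform is stored at construction time and applied per sample in `__getitem__`, so a `DataLoader` sees already-normalized tensors.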

Is the image a PIL object, a tensor, or an array?

The image is a PIL Image. If I'm not mistaken, the transforms will turn it into a tensor first and then do the normalization.

I notice that `ToTensor` converts a PIL Image or `numpy.ndarray` (H x W x C) in the range [0, 255] to a `torch.FloatTensor` of shape (C x H x W) in the range [0.0, 1.0].
But my array is already float and not in [0, 255]. For example, I have implemented rgb2lab, and the results are floats outside [0, 255]. So how should I apply `ToTensor` or `Normalize`?
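One option, since `ToTensor`'s rescaling only applies to `uint8`-style inputs, is to skip it and rescale the float array yourself from each channel's known value range. The sketch below assumes the conventional Lab ranges (L in [0, 100], a/b in [-128, 127]); check what your rgb2lab actually produces:

```python
import numpy as np

def normalize_channels(x, mins, maxs):
    """Rescale each channel of an H x W x C float array to [0, 1],
    given known per-channel min/max values."""
    mins = np.asarray(mins, dtype=np.float64)
    maxs = np.asarray(maxs, dtype=np.float64)
    return (x - mins) / (maxs - mins)  # broadcasts over the channel axis

# Hypothetical Lab image: L in [0, 100], a/b in [-128, 127]
lab = np.zeros((2, 2, 3))
lab[..., 0] = 50.0  # mid lightness
out = normalize_channels(lab, mins=[0, -128, -128], maxs=[100, 127, 127])
# channel 0 -> 0.5, channels 1 and 2 -> ~0.502
```

From there the result can be passed through `torch.from_numpy` and a per-channel mean/std normalization as in the earlier replies.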

I haven't used rgb2lab in my experiments.

From what I understand, OpenCV has an RGB-to-Lab conversion which can help you a lot.

The documentation is here: OpenCV `cvtColor` documentation

If you scroll down a bit you can see:
RGB <-> CIE L\*a\*b\* (`CV_BGR2Lab`, `CV_RGB2Lab`, `CV_Lab2BGR`, `CV_Lab2RGB`)

From a brief reading, I noticed that the Lab channels are converted to documented value ranges (listed in the page above).

So I would normalize each channel according to its range.

Hope this helps a bit.

Thank you, I will look it up.