What does it mean to normalize images for ResNet?

Luke_Kollmorgen · September 14, 2020, 12:27am

I'm using a fairly simple set of steps to prepare images for feature extraction with a pretrained ResNet-152 model.


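import torchvision
from PIL import Image
from torchvision import models

# Load the image; convert to RGB so it has the 3 channels Normalize expects
img = Image.open("Documents/img.png").convert("RGB")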

# Load the pretrained model
model = models.resnet152(pretrained=True)

# Use the model object to select the desired layer
layer = model._modules.get('fc')

# Set model to evaluation mode
model.eval()

transformations = torchvision.transforms.Compose([
    torchvision.transforms.Resize(256),
    torchvision.transforms.CenterCrop(224),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
t_img = transformations(img)


What I'm trying to understand is what it means to normalize the images with those means and standard deviations. We obviously aren't setting pixel values, right? My understanding is that pixel values should be integers between 0 and 255, so what does it mean to normalize in this context?

ebarsoum · September 14, 2020, 12:38am

Normalize, in the above case, means subtracting the per-channel mean from each pixel and dividing the result by the per-channel standard deviation. The input image is expected to be float in the range [0, 1], not integer. So when you load the image, the pixel values need to be divided by 255.
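To make the arithmetic concrete, here is a minimal sketch in plain Python for a single pixel. The pixel value `(255, 128, 0)` and the helper name `normalize_pixel` are made up for illustration; only the mean and std values come from the question.

```python
# Worked arithmetic for one RGB pixel; the pixel value is hypothetical.
MEAN = [0.485, 0.456, 0.406]  # per-channel means from the question
STD = [0.229, 0.224, 0.225]   # per-channel standard deviations from the question


def normalize_pixel(rgb_uint8):
    """Scale an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    scaled = [v / 255.0 for v in rgb_uint8]
    return [(s - m) / sd for s, m, sd in zip(scaled, MEAN, STD)]


pixel = (255, 128, 0)  # hypothetical pixel, not taken from the thread
normalized = normalize_pixel(pixel)
print([round(v, 3) for v in normalized])  # prints [2.249, 0.205, -1.804]
```

Note that the result is no longer confined to [0, 1] and can be negative. That is expected: the normalized values are model inputs, not displayable pixel intensities.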

Sunshine352 · September 14, 2020, 9:05am

[quote="Luke_Kollmorgen, post:1, topic:96160"]
torchvision.transforms.ToTensor()
[/quote]
ToTensor() transforms the pixel range from [0, 255] to [0, 1], so no separate division by 255 is needed in the pipeline above.

zia_badar · June 5, 2022, 4:06pm

But why the constant values mean=[0.485, 0.456, 0.406] and std=[0.229, 0.224, 0.225]? Shouldn't each image be normalized to zero mean and unit standard deviation, or are this mean and std calculated from the whole dataset?

TigerYan86 · June 21, 2023, 5:31pm

I have the same question. How are these values determined?

ptrblck · June 21, 2023, 6:36pm

These stats are calculated from the ImageNet training dataset.
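As a rough sketch of what "calculated from the dataset" means: one mean and one standard deviation per channel, pooled over every pixel of every image. The random tensors below are a stand-in for real data, and this is an illustration of the idea rather than the exact procedure used to obtain the ImageNet values.

```python
import torch

# Stand-in dataset: 8 random RGB images, already scaled to [0, 1].
torch.manual_seed(0)
images = torch.rand(8, 3, 16, 16)  # (N, C, H, W)

# One mean and one std per channel, pooled over all images and pixels.
mean = images.mean(dim=(0, 2, 3))
std = images.std(dim=(0, 2, 3))

# Applying the dataset-wide stats standardizes the dataset as a whole,
# not each individual image.
normalized = (images - mean.view(1, 3, 1, 1)) / std.view(1, 3, 1, 1)

print(mean, std)
print(normalized.mean(dim=(0, 2, 3)), normalized.std(dim=(0, 2, 3)))
```

With dataset-wide statistics, an individual image will generally not end up with exactly zero mean and unit standard deviation; only the dataset as a whole does. Reusing the ImageNet values keeps inputs on the scale the pretrained weights were trained with.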