Normalization for uint16 images

Hello, I am trying to normalize my images, which are of type uint16, but the resulting values are not in the range [0, 1]. My sample code is below. Any help is appreciated. Thanks.

import numpy as np
from osgeo import gdal
from keras.utils import to_categorical  # assumed source of to_categorical
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms

class CustomDataset(Dataset):
    def __init__(self, image_paths, target_paths, train=True):  # store paths and build transforms
        self.image_paths = image_paths
        self.target_paths = target_paths
        # Per-channel mean and std for the 4 image channels
        self.transforms = transforms.Compose([
            transforms.ToTensor(),
            transforms.Normalize(
                (691.0994607113063, 891.5849978397365, 1019.7352398111965, 3172.54439667618),
                (238.88659552814724, 267.57694692565684, 347.344288531034, 689.5044531578629)),
        ])
        self.transforms2 = transforms.Compose([transforms.ToTensor()])

    def __getitem__(self, index):
        # Assumed: the image is read like the mask (this line is not in the post)
        img = gdal.Open(self.image_paths[index], gdal.GA_ReadOnly).ReadAsArray()
        img = np.moveaxis(img, 0, -1)  # channels first -> channels last
        img = img.astype(np.float64)
        # img = Image.fromarray(img, 'RGBA')
        # img.verify()
        mask = gdal.Open(self.target_paths[index], gdal.GA_ReadOnly)
        mask = mask.ReadAsArray()
        mask = to_categorical(mask, 14)  # one-hot encode the 14 classes
        mask = np.expand_dims(mask, axis=0)
        mask = np.moveaxis(mask, 0, -1)

        mask = mask.astype(np.float64)
        mask1 = np.zeros((250, 250, 14), dtype=np.float64)
        for i in range(14):
            pass  # loop body is not shown in the post
        t_image = self.transforms(img)
        t_masks = self.transforms2(mask1)
        return t_image, t_masks

    def __len__(self):  # return the number of samples
        return len(self.image_paths)

train_dataset = CustomDataset(X_train, Y_train, train=True)
train_loader = DataLoader(train_dataset, batch_size=8, shuffle=True, num_workers=1)

transforms.ToTensor() should already normalize your tensor to [0, 1].
transforms.Normalize will subtract the mean and divide by the standard deviation, such that your tensors should have a zero mean and a stddev of 1.
If you just want to have tensors in [0, 1], remove the transforms.Normalize transformation and check the values of your tensors.
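
To illustrate why standardized values do not land in [0, 1], here is a minimal NumPy sketch of the same per-channel arithmetic. The `standardize` helper and the synthetic image are assumptions for illustration only, not the torchvision implementation.

```python
import numpy as np

def standardize(img, mean, std):
    """Per-channel (x - mean) / std for a channel-first (C, H, W) array."""
    mean = np.asarray(mean, dtype=np.float64).reshape(-1, 1, 1)
    std = np.asarray(std, dtype=np.float64).reshape(-1, 1, 1)
    return (img.astype(np.float64) - mean) / std

rng = np.random.default_rng(0)
# Synthetic 4-channel uint16 image standing in for a real sample (assumption).
img = rng.integers(0, 4000, size=(4, 250, 250), dtype=np.uint16)

# Statistics taken from this single image, purely for illustration.
mean = img.mean(axis=(1, 2))
std = img.std(axis=(1, 2))

out = standardize(img, mean, std)
print(out.mean(axis=(1, 2)))         # approximately 0 per channel
print(out.std(axis=(1, 2)))          # approximately 1 per channel
print(out.min() < 0, out.max() > 1)  # values fall outside [0, 1]
```

The output is centered on zero, so roughly half of the values are negative; that is expected for standardization and is not a sign that something went wrong.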

Using only transforms.ToTensor() leaves the image values unchanged.

In that case torchvision probably doesn’t recognize the image format and you would have to normalize it yourself. For standard uint8 images, each value is divided by 255.
If your images use the complete uint16 range, you could divide by 65535.
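
A minimal sketch of that manual scaling, assuming the image is a NumPy integer array; `scale_by_dtype_range` is a hypothetical helper name, not a torchvision function.

```python
import numpy as np

def scale_by_dtype_range(img):
    """Map an integer image to [0, 1] by dividing by its dtype's maximum."""
    if not np.issubdtype(img.dtype, np.integer):
        raise TypeError("expected an integer image, got %s" % img.dtype)
    max_value = np.iinfo(img.dtype).max  # 255 for uint8, 65535 for uint16
    return img.astype(np.float32) / max_value

# Tiny example arrays (assumed values, for illustration only).
img8 = np.array([[0, 128, 255]], dtype=np.uint8)
img16 = np.array([[0, 4000, 65535]], dtype=np.uint16)

scaled8 = scale_by_dtype_range(img8)
scaled16 = scale_by_dtype_range(img16)
print(scaled8)
print(scaled16)
```

The scaled array can then be turned into a tensor, for example with `torch.from_numpy`, after moving the channel axis to the front if needed.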

OK, but would it be a problem to just divide by 65535? For some images the maximum value may only reach about 4000.

I guess it depends on the distribution of the values for all images.
You could clip some “outliers”, i.e. very high (or low) values, or normalize each image separately using its own min/max (as long as that doesn’t destroy information, as it would for depth images, for example). What kind of images are you using?
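
Both ideas can be combined in a rough sketch like the one below. The 1st/99th percentile cut-offs, the channel-first layout, and the `clip_and_rescale` name are assumptions chosen for illustration; suitable thresholds depend on your data.

```python
import numpy as np

def clip_and_rescale(img, low_pct=1.0, high_pct=99.0):
    """Clip each channel of a (C, H, W) image to percentiles, then min/max scale to [0, 1]."""
    img = img.astype(np.float64)
    lo = np.percentile(img, low_pct, axis=(1, 2), keepdims=True)
    hi = np.percentile(img, high_pct, axis=(1, 2), keepdims=True)
    clipped = np.clip(img, lo, hi)
    # Guard against division by zero for constant channels.
    span = np.maximum(hi - lo, np.finfo(np.float64).eps)
    return (clipped - lo) / span

rng = np.random.default_rng(0)
# Synthetic 4-channel image with one extreme pixel per channel (assumption).
img = rng.integers(0, 4000, size=(4, 250, 250), dtype=np.uint16)
img[:, 0, 0] = 65535

out = clip_and_rescale(img)
print(out.min(), out.max())
```

Because the scaling is computed per image, absolute intensity differences between images are lost, which is the caveat mentioned above.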

Satellite images with 4 channels: RGB and infrared. Currently I have normalized each image separately.