In my case, I have CT images with minimum and maximum Hounsfield intensity values of -1024 and 3597, which I scaled to be within [0, 1]. From the discussion so far, I realized that normalization is needed for better performance. My question is: do I need to do this for the validation and test datasets as well? If yes, can I use the mean and standard deviation computed on the training dataset, or should I compute them separately for the validation and test datasets?
How can you get all the values in the tensor in the range [0, 1] by using image = torch.randint(0,255,(5, 5, 3), dtype=torch.uint8)? Doesn’t that produce values in the range [0, 255]? It does not seem to be right.
The normalization is usually applied to the images. What if we have masks associated with the images as well? In that case the normalization is also applied to the mask images, and I get the following error:
RuntimeError: output with shape [1, 512, 512] doesn’t match the broadcast shape [3, 512, 512]
This is because my masks are grayscale but the images are RGB. I want to apply the same transformations to both the images and the masks, except for the normalization. Any help in this regard will be highly appreciated. Thanks
The error is raised, since the number of channels in the mask doesn’t match the provided mean and std in the Normalization transformation.
You could apply the Normalization only to the data tensor and skip it for the mask.
E.g. in case you are passing a transform object to the Dataset, remove the Normalize transformation from it and either apply it inside the Dataset, if you are using a custom Dataset implementation, or check if your current Dataset accepts a target_transform argument.
When computing the mean and standard deviation of the training dataset, are np.mean(training_data_array) and np.std(training_data_array) the same as the batched mean and statistics from your dataloader?
If not, how will I get the batch mean and standard deviation from the dataloader when the same transform.Normalize() is already in my dataloader?
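For what it's worth, here is a sketch of one way to compare the two. It assumes the training images fit in a float tensor of shape [N, C, H, W] and that the loader yields the data before any Normalize is applied (otherwise you would be measuring already-normalized values). It accumulates per-channel sums over the batches and checks the result against the statistics of the whole array. Note that np.mean(training_data_array) without an axis argument returns one scalar over all channels, while Normalize for RGB input expects one value per channel.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
# Stand-in training data, shape [N, C, H, W]; an assumption for this sketch
train_images = torch.rand(10, 3, 8, 8)
loader = DataLoader(TensorDataset(train_images), batch_size=4)  # last batch has 2 samples

channel_sum = torch.zeros(3, dtype=torch.float64)
channel_sq_sum = torch.zeros(3, dtype=torch.float64)
num_pixels = 0
for (batch,) in loader:
    batch = batch.double()
    channel_sum += batch.sum(dim=(0, 2, 3))            # sum of values per channel
    channel_sq_sum += (batch ** 2).sum(dim=(0, 2, 3))  # sum of squares per channel
    num_pixels += batch.shape[0] * batch.shape[2] * batch.shape[3]

mean = channel_sum / num_pixels
# Population standard deviation (ddof=0), which is also the np.std default
std = (channel_sq_sum / num_pixels - mean ** 2).sqrt()

# Reference: per-channel statistics over the whole array at once
ref_mean = train_images.numpy().mean(axis=(0, 2, 3))
ref_std = train_images.numpy().std(axis=(0, 2, 3))
```

Accumulating sums this way gives the same result as the whole-array statistics even when the last batch is smaller. Simply averaging the per-batch means and stds would not, in general.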
The purpose of transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)) is to convert the range from [0, 1] to [-1, 1]. Because your image is loaded as a PIL image and converted with ToTensor, it is in the range [0, 1] by default. Here, the author of the code makes use of the standardization formula to adjust the range of values. However, the purpose of this code is not standardization. For example,
If you have a pixel with value 0, its conversion will be:
(0-0.5)/ 0.5 = -1
If you have a pixel with value 1, its conversion will be:
(1-0.5)/0.5 = 1
Recall the standardization formula:
(data_point_value - mean) / std
By setting mean = 0.5 and std = 0.5, we can reuse PyTorch's existing transform function for this conversion.
Hi @ptrblck, could you please answer me?
The images’ type becomes float after applying transforms.Normalize(). My masks are binary images (0 and 1, uint8). Should I change the masks’ type to float as well, even though they only contain 0s and 1s? Also, in general, which type is better in image processing for image classification, segmentation or detection: float or int?
Yes, that’s expected as the default input type is float32 and Normalize creates input tensors with a zero mean and a unit variance (assuming the dataset statistics are passed as mean and std).
No, this sounds wrong. I assume you want to use your mask in a multiplication to “mask” specific values and are thus depending on the 1s and 0s. In this case, don’t use Normalize on it.
If that’s not the case, let me know how you are using these masks in your model.
As often, it depends on your use case. If you are purely working on image processing you might want to keep the image in an integer type and manipulate it directly. However, if you want to train e.g. a neural network using these images, floating point inputs are usually the way to go since (all) math ops in your model will be using floating point tensors so that you can train them. Also, normalizing the inputs usually helps in the model training.