About torchvision.transforms.ToTensor for segmentation tasks

hi,

`torchvision.transforms.ToTensor`

Convert a PIL Image or numpy.ndarray to tensor.

Converts a PIL Image or numpy.ndarray (H x W x C) in the range [0, 255] to a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0].
Why do we need to use ToTensor on the input images of a segmentation task? Is this related to the loss function?
thanks!

Normalizing the input (to [0, 1] or via z-score) is usually beneficial for training, i.e. your model might converge faster and perform better.
However, in a segmentation use case you should be careful using ToTensor on your mask image, since this might destroy the labels.
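For example, here is a toy illustration (the mask values are made up) of what ToTensor does to an integer mask:

    import numpy as np
    import torch
    from torchvision import transforms

    # Toy 2x2 mask holding the class indices 0, 1 and 2
    mask = np.array([[0, 1],
                     [2, 1]], dtype=np.uint8)

    # ToTensor divides uint8 inputs by 255, so the class indices
    # become small floats and are effectively destroyed
    print(transforms.ToTensor()(mask))
    # tensor([[[0.0000, 0.0039],
    #          [0.0078, 0.0039]]])

    # Converting without rescaling keeps the labels intact
    print(torch.from_numpy(mask).long())
    # tensor([[0, 1],
    #         [2, 1]])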

Do I need to build two transforms.Compose pipelines?

    import torchvision.transforms as transforms

    img_transform = transforms.Compose([
        transforms.RandomRotation(2),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
    ])
    label_transform = transforms.Compose([
        transforms.RandomRotation(2),
    ])

How can I guarantee that the same rotation angle is applied to both the image and the mask?

For a segmentation use case, you should use the functional API of torchvision.transforms as shown here.
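Something along these lines should work (a minimal sketch; the helper name is made up, and the angle range of ±2 degrees mirrors the RandomRotation(2) above):

    import random

    import numpy as np
    import torch
    import torchvision.transforms.functional as TF

    def my_segmentation_transforms(image, mask):
        # Sample one angle and apply it to image and mask alike
        angle = random.uniform(-2, 2)
        image = TF.rotate(image, angle)
        mask = TF.rotate(mask, angle)
        # Only the image is rescaled and normalized
        image = TF.to_tensor(image)
        image = TF.normalize(image, [0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
        # Keep the mask as integer class indices
        mask = torch.as_tensor(np.array(mask), dtype=torch.int64)
        return image, mask

Since the angle is sampled once and passed to both TF.rotate calls, image and mask stay aligned.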

Transform to tensor

    image = TF.to_tensor(image)
    mask = TF.to_tensor(mask)

Does transforms.ToTensor() have the same effect?

And could TF.to_tensor(mask) also destroy the labels?

thanks!

Both methods have the same effect.
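You can check this quickly with a dummy image:

    import torch
    from PIL import Image
    import torchvision.transforms.functional as TF
    from torchvision import transforms

    image = Image.new("RGB", (4, 4), color=(128, 64, 32))  # dummy image

    a = transforms.ToTensor()(image)
    b = TF.to_tensor(image)
    print(torch.equal(a, b))  # True: ToTensor() calls to_tensor internally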

Yes, depending on the format of your target image.
E.g. if your target is a color image where each color encodes a class, you should instead create a mapping between colors and class indices. ToTensor will normalize the image so that the values lie in [0, 1], which is no longer suitable for e.g. nn.CrossEntropyLoss, as it expects class indices in the range [0, nb_classes-1].
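A rough sketch of such a color-to-class mapping could look like this (the palette below is made up):

    import numpy as np
    import torch

    # Hypothetical palette: each RGB color encodes one class
    color_to_class = {
        (0, 0, 0): 0,    # background
        (255, 0, 0): 1,
        (0, 255, 0): 2,
    }

    def mask_to_target(mask_rgb):
        """mask_rgb: uint8 numpy array of shape (H, W, 3)."""
        target = torch.zeros(mask_rgb.shape[:2], dtype=torch.long)
        for color, cls in color_to_class.items():
            matches = (mask_rgb == np.array(color, dtype=np.uint8)).all(axis=-1)
            target[torch.from_numpy(matches)] = cls
        # LongTensor of class indices, directly usable with nn.CrossEntropyLoss
        return target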