Torchvision transform for segmentation masks

I am trying to train a segmentation model, so I have pairs of grayscale images and their corresponding masks. To convert these into tensors, I am using torchvision transforms, i.e.

from torchvision import transforms

For the grayscale image

img_transform = transforms.Compose([transforms.ToPILImage(), transforms.ToTensor()])    
img = img_transform(img)

which converts my img to a tensor of dtype torch.float32.

But I can't use the same transform on the mask, since a mask can't have float values. It needs to have int values representing the classes…

Please help me with a transform I could use to convert my masks to tensors without converting the mask values to float.

Thanks in advance… :slight_smile:

You could load the mask inside your Dataset's __getitem__ via PIL.Image.open, convert it to a numpy array, and turn it into a tensor manually using mask = torch.from_numpy(np.array(mask)). This would make sure to keep the integer dtype.

Yes, that's what I tried. I used cv2.imread to read the images and converted them to tensors using

torch.tensor(numpy_array) 

It seems like the correct approach.
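For anyone comparing the two constructors, a small sketch of the difference (the mask values below are made up for illustration):

```python
import numpy as np
import torch

# A mask as cv2.imread would return it: a uint8 array of class ids.
mask_np = np.array([[0, 1], [2, 3]], dtype=np.uint8)

mask_copy = torch.tensor(mask_np)      # copies the data, keeps dtype uint8
mask_view = torch.from_numpy(mask_np)  # zero-copy, shares memory with mask_np

# Both preserve the integer dtype instead of converting to float32.
# Losses such as nn.CrossEntropyLoss expect class indices as int64 (long):
target = mask_copy.long()
```

Both keep the integer dtype; torch.tensor copies the data while torch.from_numpy shares memory with the numpy array, so later in-place changes to the array are visible through the latter.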
Thanks for replying :slight_smile: