I am working on a semantic segmentation task and I have to make a customized dataset. The images are 24-bit per pixel (RGB) and the masks are 8-bit per pixel (single channel).
My customized dataset is as follows:
import glob
import os

from PIL import Image
from torch.utils.data import Dataset
import torchvision.transforms.functional as TF


class MyDataset(Dataset):
    def __init__(self, root, set_name):
        super(MyDataset, self).__init__()
        assert set_name in ('train', 'val', 'test')
        self.root = root
        self.set_name = set_name
        self.image_list = glob.glob(os.path.join(
            root,
            set_name,
            args.images_folder,
            "*.tif",
        ))
        self.label_list = glob.glob(os.path.join(
            root,
            set_name,
            args.labels_folder,
            "*.tif",
        ))

    def __getitem__(self, index):
        images = Image.open(self.image_list[index])
        masks = Image.open(self.label_list[index])
        t_images = TF.to_tensor(images)
        t_masks = TF.to_tensor(masks)
        return t_images, t_masks

    def __len__(self):
        return len(self.image_list)
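For context, a dataset like this is normally consumed through a DataLoader, whose default collate stacks each field of the samples into a batch. Here is a minimal sketch with a synthetic stand-in dataset (hypothetical, since the real one needs .tif files on disk):

```python
import torch
from torch.utils.data import DataLoader, Dataset

class SyntheticSegDataset(Dataset):
    # Hypothetical stand-in for MyDataset: returns random image/mask tensors.
    def __len__(self):
        return 4

    def __getitem__(self, index):
        image = torch.rand(3, 8, 8)            # 3-channel image
        mask = torch.randint(0, 5, (1, 8, 8))  # 1-channel mask
        return image, mask

loader = DataLoader(SyntheticSegDataset(), batch_size=2)
images, masks = next(iter(loader))
print(images.shape, masks.shape)  # torch.Size([2, 3, 8, 8]) torch.Size([2, 1, 8, 8])
```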
But it raises an error like this:
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 3 and 1 in dimension 1 at /pytorch/aten/src/TH/generic/THTensorMath.cpp:3616
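The mechanism behind this error can be reproduced in isolation: torch.stack (which the default collate uses under the hood) requires all tensors to have identical shapes, so a 3-channel tensor cannot be stacked with a 1-channel one. A minimal sketch:

```python
import torch

rgb = torch.zeros(3, 4, 4)   # 3-channel tensor, like an RGB image
gray = torch.zeros(1, 4, 4)  # 1-channel tensor, like an 8-bit mask

try:
    torch.stack([rgb, gray])  # shapes differ in the channel dimension
except RuntimeError as e:
    print("stack failed:", e)
```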
I think it is because the masks only have a single channel, so I changed the code:
images = Image.open(self.image_list[index]).convert('RGB')
masks = Image.open(self.label_list[index]).convert('RGB')
Then it works, but I have to reshape the masks when I feed the data and targets to the network. So I want to know: is there any solution that avoids changing the mask format when reading the dataset? I do not know whether the operations above will have bad effects.
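For what it's worth, one common pattern (a sketch, not necessarily the only fix) is to keep the mask single-channel and convert it to a long tensor of class indices yourself, since TF.to_tensor rescales 8-bit values to the [0, 1] range, which is usually not what a segmentation loss expects. The helper name below is hypothetical:

```python
import numpy as np
import torch
from PIL import Image

def mask_to_tensor(mask_img):
    # Keep the mask single-channel: convert class indices to a LongTensor
    # of shape (H, W), without the 0-1 rescaling done by TF.to_tensor.
    return torch.from_numpy(np.array(mask_img, dtype=np.int64))

# Hypothetical example with a tiny in-memory 8-bit mask:
mask = Image.fromarray(np.array([[0, 1], [2, 1]], dtype=np.uint8), mode='L')
t_mask = mask_to_tensor(mask)
print(t_mask.shape, t_mask.dtype)  # torch.Size([2, 2]) torch.int64
```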
Thanks in advance