I saw an example here How make customised dataset for semantic segmentation? of how to build a dataset, but I must admit I still have doubts.
Basically I have a PNG image already labelled with the indices 0, 1, 2, 3. It's a custom palette that I made, so technically I should be able to avoid the mapping as in the example, or am I wrong?
self.mapping = {
    0: 0,
    255: 1
}

def mask_to_class(self, mask):
    for k in self.mapping:
        mask[mask == k] = self.mapping[k]
    return mask
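If I understand correctly, since my masks already store the indices directly, I could skip the remapping dict and just convert the dtype. A sketch of what I mean (the mask array here is randomly generated, just standing in for np.array(Image.open(path)) on my real PNG):

```python
import numpy as np
import torch

# Sketch: suppose the PNG has already been loaded as a NumPy array and
# its pixel values are the class indices 0..3 from my custom palette,
# so no remapping should be needed -- just convert to a LongTensor.
mask = np.random.randint(0, 4, size=(64, 64))  # stand-in for the real mask
mask = torch.from_numpy(mask).long()           # dtype torch.int64, shape (64, 64)
print(mask.shape)  # torch.Size([64, 64])
```

Is that right, or is there a reason to keep a mask_to_class even for already-indexed masks?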
Do you have a snippet of code that can create the class indices in the shape [batch_size, height, width], as you said?
Currently, just for one class, I was trying to apply mask = torch.from_numpy(mask), but I'm not sure if that's OK.
def transforms(self, image, mask):
    # img = img.resize((wsize, baseheight), PIL.Image.ANTIALIAS)
    image = image.resize((64, 64), PIL.Image.NEAREST)
    mask = mask.resize((64, 64), PIL.Image.NEAREST)
    mask = np.array(mask)
    image = TF.to_tensor(image)
    mask = torch.from_numpy(mask)
    return [image, mask]
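About the [batch_size, height, width] shape: is the idea that each sample's mask is an (H, W) LongTensor and the DataLoader's default collate stacks them into the batch dimension? A toy sketch of what I have in mind (the dataset is fake, just to check the shapes):

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

# Toy dataset sketch: each mask is an (H, W) LongTensor of class indices,
# so the default collate should stack them into [batch_size, H, W].
class ToySegDataset(Dataset):
    def __len__(self):
        return 8

    def __getitem__(self, idx):
        image = torch.rand(3, 64, 64)                        # fake RGB image
        mask = torch.from_numpy(
            np.random.randint(0, 4, size=(64, 64))).long()   # fake index mask
        return image, mask

loader = DataLoader(ToySegDataset(), batch_size=4)
images, masks = next(iter(loader))
print(images.shape)  # torch.Size([4, 3, 64, 64])
print(masks.shape)   # torch.Size([4, 64, 64])
```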
Now the mask is actually between zero and one, as it should be (the problem of more indices I still need to solve, of course), but once I try to train I get this:
ValueError: Target size (torch.Size([1, 64, 64])) must be the same as input size (torch.Size([1, 1, 64, 64]))
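Just to check I understand the error: binary_cross_entropy_with_logits seems to want pred and target with identical shapes, so my [1, 64, 64] mask would just be missing a channel dimension. A toy reproduction with fake tensors:

```python
import torch
import torch.nn.functional as F

# Fake tensors mimicking my shapes: model output is [1, 1, 64, 64]
# (logits) while the mask comes out as [1, 64, 64]. Adding a channel
# dimension with unsqueeze(1) makes the shapes line up.
pred = torch.randn(1, 1, 64, 64)          # fake model output (logits)
mask = torch.randint(0, 2, (1, 64, 64))   # fake binary mask
target = mask.unsqueeze(1).float()        # [1, 64, 64] -> [1, 1, 64, 64]
loss = F.binary_cross_entropy_with_logits(pred, target)
print(target.shape)  # torch.Size([1, 1, 64, 64])
```

Is unsqueeze(1) on the target the right way to handle this, or should the dataset itself return the mask with a channel dimension?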
Instead, I tried to add, in the calculation of the loss, target_ = torch.empty(batch_size, 1, 64, 64) followed by target = target_.to(device):
def calc_loss(pred, target, metrics, bce_weight=0.5):
    target_ = torch.empty(batch_size, 1, 64, 64)
    target = target_.to(device)
    bce = F.binary_cross_entropy_with_logits(pred, target)
    pred = F.sigmoid(pred)
    dice = dice_loss(pred, target)
This does not crash, but I get NaN as output for the loss. So yes, I guess I'm quite confused; I'm trying to read as much as I can but cannot figure out much at the moment, so any help would be great. Thanks again!
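Rereading my calc_loss, I suspect the NaN comes from torch.empty, which returns uninitialized memory (arbitrary values, not zeros), and on top of that it throws away the real target entirely. My guess at a fix is to keep the real mask and only reshape and cast it; a sketch (dice term left out, fake tensors just to check it runs):

```python
import torch
import torch.nn.functional as F

def calc_loss(pred, target, bce_weight=0.5):
    # Keep the real target; just give it a channel dim and float dtype
    # instead of replacing it with an uninitialized torch.empty tensor.
    target = target.unsqueeze(1).float()  # [B, 64, 64] -> [B, 1, 64, 64]
    bce = F.binary_cross_entropy_with_logits(pred, target)
    return bce  # dice term omitted in this sketch

pred = torch.randn(2, 1, 64, 64)          # fake logits
mask = torch.randint(0, 2, (2, 64, 64))   # fake binary mask
loss = calc_loss(pred, mask)
print(torch.isfinite(loss))  # tensor(True)
```

Does that reasoning about torch.empty sound right?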