Multiclass Segmentation

Sorry, forget the point about the necessity of one-hot encoded targets. It was just a small correction about the values in the target: they do not have to be strictly zeros or ones but could take any value in between ("soft labels"). Since you are clearly not using soft labels, you can simply create one-hot encoded targets via the functional API.
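For example, a minimal sketch of building one-hot targets from class indices via torch.nn.functional.one_hot (the shapes here are just illustrative):

import torch
import torch.nn.functional as F

nb_classes = 3
# target containing class indices in the shape [batch_size, height, width]
target = torch.randint(0, nb_classes, (2, 4, 4))

# one_hot appends the class dimension last, so permute it to [N, nb_classes, H, W]
target_one_hot = F.one_hot(target, num_classes=nb_classes).permute(0, 3, 1, 2).float()
print(target_one_hot.shape)  # torch.Size([2, 3, 4, 4])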

Dice loss should work fine if the target mask is one-hot encoded (i.e. each channel contains only 0s and 1s).
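For reference, a minimal soft Dice loss sketch that expects such a one-hot target (the function name and smoothing value are just an example, not a fixed API):

import torch

def dice_loss(output, target_one_hot, eps=1e-6):
    # output: [N, nb_classes, H, W] logits; target_one_hot: same shape containing 0s and 1s
    probs = torch.softmax(output, dim=1)
    dims = (0, 2, 3)
    intersection = (probs * target_one_hot).sum(dims)
    cardinality = probs.sum(dims) + target_one_hot.sum(dims)
    dice = (2.0 * intersection + eps) / (cardinality + eps)
    return 1.0 - dice.mean()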

Hi! Could you please clarify the difference between output and target? [N, nb_classes, H, W] is the output of my UNet model, and I get the error “Expected object of scalar type Long but got scalar type Float” when iterating the DataLoader.

“Output” refers to the model output while “target” refers to the ground truth tensor containing the target class indices for your segmentation use case. Based on the error message your target is not the expected LongTensor, so call target = target.long() on it before passing it to the criterion and make sure it has the expected shape ([batch_size, height, width] based on the posted model output shape).
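Something like this minimal sketch (the shapes are just placeholders matching the posted model output):

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

output = torch.randn(2, 5, 8, 8)                 # [batch_size, nb_classes, H, W] model output
target = torch.randint(0, 5, (2, 8, 8)).float()  # assume the DataLoader yields a FloatTensor

target = target.long()               # cast to the expected LongTensor
loss = criterion(output, target)     # target shape: [batch_size, H, W]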

Thank you very much! It works now.

Maybe you can refer to this notebook, which I implemented based on this discussion.

Hi,
I followed the previous answers and tried to create my dataset for multiclass segmentation:

import numpy as np
import torch
from torch.utils.data import Dataset
from torchvision import transforms as T
from PIL import Image


class MyDataset(Dataset):
    def __init__(self, images, masks, mean, std, transforms=None):
        self.images = images
        self.masks = masks
        self.transforms = transforms
        self.mean = mean
        self.std = std

    def mask_to_class(self, mask):
        # map every RGB color found in this mask to a class index
        target = torch.from_numpy(mask)
        h, w = target.shape[0], target.shape[1]
        masks = torch.empty(h, w, dtype=torch.long)
        colors = torch.unique(target.view(-1, target.size(2)), dim=0).numpy()
        target = target.permute(2, 0, 1).contiguous()
        mapping = {tuple(c): t for c, t in zip(colors.tolist(), range(len(colors)))}
        for k in mapping:
            # pixels where all three channels match the color k
            idx = (target == torch.tensor(k, dtype=torch.uint8).unsqueeze(1).unsqueeze(2))
            validx = (idx.sum(0) == 3)
            masks[validx] = torch.tensor(mapping[k], dtype=torch.long)
        return masks
    
    def __len__(self):
        return len(self.images)
    
    def __getitem__(self, idx):
        image_path = self.images[idx]
        image = Image.open(image_path)
        image = np.array(image)
        mask_path = self.masks[idx]
        mask = Image.open(mask_path)
        mask = np.array(mask)
        
        if self.transforms:
            aug = self.transforms(image=image, mask=mask)
            image = T.ToPILImage()(aug['image'])
            mask = T.ToPILImage()(aug['mask'])
        else:
            image = T.ToPILImage()(image)
            mask = T.ToPILImage()(mask)
        
        t = T.Compose([T.Resize(256), T.ToTensor(), T.Normalize(self.mean, self.std)])
        tm = T.Resize(256)
        
        image = t(image)
        mask = tm(mask)
        mask = self.mask_to_class(np.array(mask))
        
        return image, mask

When I call fit(...), I get this error:

/usr/local/src/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:105: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [3,0,0], thread: [96,0,0] Assertion `t >= 0 && t < n_classes` failed.
...
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED

Could you tell me why? Maybe it is related to the mask_to_class function?

Check if you are running out of memory and, if so, reduce e.g. the batch size used for training.


I tried to significantly reduce the batch size and the dimension of the images. Now I get a slightly different error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/tmp/ipykernel_33/3410204218.py in <module>
      8                                             steps_per_epoch=len(train_dlr))
      9 
---> 10 history = fit(epoch, model, train_dlr, valid_dlr, criterion, optimizer, sched)

/tmp/ipykernel_33/2881311107.py in fit(epochs, model, train_loader, val_loader, criterion, optimizer, scheduler)
     29             #TODO
     30             #backward
---> 31             loss.backward()
     32             optimizer.step() #update weight
     33             optimizer.zero_grad() #reset gradient

/opt/conda/lib/python3.7/site-packages/torch/_tensor.py in backward(self, gradient, retain_graph, create_graph, inputs)
    253                 create_graph=create_graph,
    254                 inputs=inputs)
--> 255         torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
    256 
    257     def register_hook(self, hook):

/opt/conda/lib/python3.7/site-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)
    147     Variable._execution_engine.run_backward(
    148         tensors, grad_tensors_, retain_graph, create_graph, inputs,
--> 149         allow_unreachable=True, accumulate_grad=True)  # allow_unreachable flag
    150 
    151 

RuntimeError: cuDNN error: CUDNN_STATUS_MAPPING_ERROR


Update:
I have tried to resize images and masks by:

image = Image.open(image_path).resize((64,64))
mask = Image.open(mask_path).resize((64,64))

Now the error is:

RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1

I read on Stack Overflow that the problem may be related to the numbering of the classes, but I'm not sure about it.
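One way to verify this (a debugging sketch, assuming nb_classes matches the number of output channels of the model) would be to check the range of the target indices on the CPU, or with CUDA_LAUNCH_BLOCKING=1, before the loss is computed:

# iterate the training loader once and inspect the class indices in the masks
for image, mask in train_dlr:
    print(mask.min(), mask.max(), mask.unique())
    assert mask.min() >= 0 and mask.max() < nb_classes, "target contains invalid class indices"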

Could you post a minimal, executable code snippet as well as the output of python -m torch.utils.collect_env so that we could debug the issue, please?

I am currently running the code on Kaggle.
The code is public and can be found here:
https://www.kaggle.com/cortomalt/multiclassseg

Sorry, I’m not able to see any script besides an empty window showing “Version 0 of 0”.

Sorry, I forgot to run the code once the notebook was made public.
Now it should work

I was still getting the error described above, so I tried running it on the CPU only. Now I get this error:

/opt/conda/lib/python3.7/site-packages/torch/nn/functional.py in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction)
   2822     if size_average is not None or reduce is not None:
   2823         reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 2824     return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
   2825 
   2826 

IndexError: Target 11 is out of bounds.

I do not understand whether it is due to the output's shape or to the mapping of the masks.

Based on the error message, your target contains invalid indices.
nn.CrossEntropyLoss expects a model output in the shape [batch_size, nb_classes] containing logits and a target in the shape [batch_size] containing class indices in the range [0, nb_classes-1] (for segmentation: an output of [batch_size, nb_classes, height, width] and a target of [batch_size, height, width]).
Your target contains the class index 11, which would mean the model output should provide at least 12 class channels ([batch_size, >=12, ...]), and that doesn't seem to be the case.
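A small sketch illustrating this expectation for a segmentation output (the shapes are illustrative):

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

output = torch.randn(2, 3, 8, 8)         # model predicting 3 classes
target = torch.randint(0, 3, (2, 8, 8))  # valid class indices in [0, 2]
loss = criterion(output, target)         # works

target[0, 0, 0] = 11                     # invalid index for a 3-class output
# criterion(output, target)              # would raise: IndexError: Target 11 is out of bounds.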


I do not know if I should open a new topic.
Now the error is:

/tmp/ipykernel_39/2752247508.py in __getitem__(self, idx)
     35 
     36         if self.transforms:
---> 37             aug = self.transforms(image=image, mask=mask)
     38             image = T.ToPILImage()(aug['image'])
     39             mask = T.ToPILImage()(aug['mask'])

TypeError: __call__() got an unexpected keyword argument 'image'

Could it be a problem with the data format required by Albumentations?

Check how self.transforms is defined and what its input arguments are, as the image argument seems to be unexpected.


I am still stuck at the same point.
self.transforms is defined like this (and applied in __getitem__):

class MyDataset(Dataset):
    def __init__(self, images, masks, mean, std, transforms=None):
        self.images = images
        self.masks = masks
        self.transforms = transforms
        self.mean = mean
        self.std = std

    def __getitem__(self, idx):
        ...
        if self.transforms:
            aug = self.transforms(image=image, mask=mask)
            image = T.ToPILImage()(aug['image'])
            mask = T.ToPILImage()(aug['mask'])
        else:
            image = T.ToPILImage()(image)
            mask = T.ToPILImage()(mask)

Afterwards it is invoked like this:

t_train = T.Compose([A.HorizontalFlip(p=0.5), A.VerticalFlip(p=0.5)])
t_val = T.Compose([A.HorizontalFlip(p=0.5), A.VerticalFlip(p=0.5)])

training_set = MyDataset(training_features, training_masks, mean, std, t_train)
validation_set = MyDataset(validation_features, validation_masks, mean, std, t_val)

It seems quite simple; I just cannot figure out what I am doing wrong.

It seems you might be mixing albumentations with torchvision, so you should probably use the Compose from albumentations, which can accept multiple named inputs (image and mask).
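A rough sketch of how the pipeline could look with albumentations' own Compose (the exact transforms and the ToTensorV2 usage are just an example):

import albumentations as A
from albumentations.pytorch import ToTensorV2

# mean/std are the normalization stats already passed to MyDataset
t_train = A.Compose([
    A.Resize(256, 256),
    A.HorizontalFlip(p=0.5),
    A.VerticalFlip(p=0.5),
    A.Normalize(mean=mean, std=std),
    ToTensorV2(),
])

# inside __getitem__, albumentations expects named numpy array arguments
aug = t_train(image=image, mask=mask)
image, mask = aug['image'], aug['mask']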


I am sorry to have to open the topic again, but I cannot understand what is wrong with the data’s shape.
Also, the number of classes should be limited to the range 0-2 via the “mask_to_class” function, right?