I understand how labelling works for semantic segmentation of 2D images. I was able to create label (target) tensors using the colour coding provided with the dataset. The colour codes provided were:

```
("Animal", np.array([64, 128, 64], dtype=np.uint8)),
("Archway", np.array([192, 0, 128], dtype=np.uint8)),
("Bicyclist", np.array([0, 128, 192], dtype=np.uint8)),
("Bridge", np.array([0, 128, 64], dtype=np.uint8)),
("Building", np.array([128, 0, 0], dtype=np.uint8)),
("Car", np.array([64, 0, 128], dtype=np.uint8)),
("CartLuggagePram", np.array([64, 0, 192], dtype=np.uint8)),
("Child", np.array([192, 128, 64], dtype=np.uint8)),
```

To create a label for my network, I would locate these RGB colours in the annotated image, match each pixel to its class's colour code, and write that class's index into a grayscale image. So each label would be a single-channel tensor, and the output of my network would have as many channels as there are classes in the dataset (32 for CamVid).
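For reference, the 2D colour-to-index step described above can be sketched roughly as follows. This is only an illustration: the function name `rgb_to_class_index`, the `ignore_index` value of 255 for unmatched pixels, and the tiny 2x2 example mask are assumptions, and only three of the colour codes are repeated here.

```python
import numpy as np

# Three of the colour codes listed above; the list position is used as the class index.
COLOUR_CODES = [
    ("Animal", np.array([64, 128, 64], dtype=np.uint8)),
    ("Archway", np.array([192, 0, 128], dtype=np.uint8)),
    ("Bicyclist", np.array([0, 128, 192], dtype=np.uint8)),
]


def rgb_to_class_index(rgb_mask, colour_codes, ignore_index=255):
    """Map an [H, W, 3] colour-coded mask to an [H, W] class-index label."""
    # Pixels that match no colour code keep the (assumed) ignore_index value.
    label = np.full(rgb_mask.shape[:2], ignore_index, dtype=np.uint8)
    for class_idx, (_, colour) in enumerate(colour_codes):
        # True where all three channels equal this class's colour.
        matches = np.all(rgb_mask == colour, axis=-1)
        label[matches] = class_idx
    return label


# Hypothetical 2x2 annotation image: three known colours and one unmatched black pixel.
rgb_mask = np.zeros((2, 2, 3), dtype=np.uint8)
rgb_mask[0, 0] = [64, 128, 64]
rgb_mask[0, 1] = [192, 0, 128]
rgb_mask[1, 0] = [0, 128, 192]

label = rgb_to_class_index(rgb_mask, COLOUR_CODES)
```

The result is a single-channel label whose values are class indices, which is the form the 2D network was trained against.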

Now I am trying to do the same in 3D and am struggling to understand what to do. The problem is a binary one, since I have to detect a single item in the input, so I should have two output channels: background and foreground. The input is a single-channel 3D volume of shape `[BS, Channel, Z, X, Y]`.

For each input, I have a tensor that indicates the locations of my objects of interest. A 3D Gaussian is built around each point to be measured.
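To make the setup concrete, a target of this kind could be built roughly as below. This is a sketch under assumptions rather than my exact code: the function name `gaussian_target`, the `sigma` value, the two example points, and the choice to combine overlapping Gaussians with a maximum are all illustrative.

```python
import numpy as np


def gaussian_target(shape, points, sigma=2.0):
    """Return a [Z, X, Y] float32 volume with a 3D Gaussian centred on each point."""
    # Coordinate grids with the same [Z, X, Y] layout as the volume.
    zz, xx, yy = np.meshgrid(
        np.arange(shape[0]), np.arange(shape[1]), np.arange(shape[2]), indexing="ij"
    )
    target = np.zeros(shape, dtype=np.float32)
    for z, x, y in points:
        sq_dist = (zz - z) ** 2 + (xx - x) ** 2 + (yy - y) ** 2
        blob = np.exp(-sq_dist / (2.0 * sigma ** 2)).astype(np.float32)
        # Assumption: overlapping Gaussians are merged with a maximum, so peaks stay at 1.
        target = np.maximum(target, blob)
    return target


# Hypothetical points of interest, given as (z, x, y) voxel coordinates.
points = [(4, 10, 10), (12, 40, 50)]
target = gaussian_target((16, 64, 64), points, sigma=2.0)
```

Each point of interest then has the value 1 at its centre, falling off smoothly towards 0 with distance.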

I hope I have explained my situation clearly.

My confusion is how to create a label tensor in 3D for segmenting my points of interest as I have done for the CamVid dataset.

Below is a 2D representation of a target that has 3D Gaussians drawn on the points of interest. The tensor has size `[16, 64, 64]`.

Here is the target file as a pickle dump (just in case): https://app.box.com/s/movmlt377991ptrpd0kug0d3d8s85q9l

As in the CamVid case, where I had one channel per class, I know that here I would have two channels, one for the background and one for the foreground. What I don't understand is how to label these 3D tensors.
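One possible reading, carried over directly from the 2D case, is that the label would again hold class indices (0 for background, 1 for foreground) per voxel, obtained by thresholding the Gaussian target. The sketch below shows only that idea; the 0.5 threshold, the function name `heatmap_to_binary_label`, and the toy heatmap values are assumptions, and whether thresholding is the right way to treat the Gaussians is exactly what I am unsure about.

```python
import numpy as np


def heatmap_to_binary_label(heatmap, threshold=0.5):
    """Turn a [Z, X, Y] Gaussian heatmap into a [Z, X, Y] label of 0 (background) / 1 (foreground)."""
    return (heatmap >= threshold).astype(np.int64)


# Toy heatmap standing in for the [16, 64, 64] Gaussian target.
heatmap = np.zeros((16, 64, 64), dtype=np.float32)
heatmap[8, 32, 32] = 1.0  # centre of a Gaussian
heatmap[8, 32, 33] = 0.6  # still above the assumed threshold
heatmap[8, 32, 34] = 0.2  # tail of the Gaussian, treated as background

label = heatmap_to_binary_label(heatmap)  # [Z, X, Y], class indices

# Equivalent two-channel (one-hot) form: channel 0 = background, channel 1 = foreground.
one_hot = np.stack([1 - label, label], axis=0)  # [2, Z, X, Y]
```

With class-index labels like this, the target for a batch would have shape `[BS, Z, X, Y]` while the network output has shape `[BS, 2, Z, X, Y]`, which is the layout `torch.nn.CrossEntropyLoss` accepts for higher-dimensional inputs.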