Custom color mapping in data loader for UNET image segmentation

Hi,

I have a 1000x1000 image and a mask of the same size with 5 color-coded classes (blue, red, dark green, light green, and pink).

I extracted this snippet of my code onto Jupyter notebook for debugging.

As we see below, the image shape is (1000, 1000, 3) and same thing for the mask shape. Yet len(colors) is 3809 instead of 5. Therefore my color map has 3809 key-value pairs instead of 5. Any reason why this might be happening?

Thank you!

Could you print (some) values of the unique colors and check, if they might be noisy e.g. due to a resizing operation?

PS: It’s better to post code snippets by wrapping them in three backticks ```, as it makes debugging easier and also allows the search to index the code.
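
A minimal sketch of that check, assuming the mask is already loaded as an (H, W, 3) NumPy array. The function name `unique_colors` and the tiny synthetic mask are made up for illustration. Sorting by pixel count makes noisy values stand out, since they usually cover only a handful of pixels:

```python
import numpy as np


def unique_colors(mask):
    """Return the unique colors of an (H, W, C) mask and their pixel counts,
    sorted from most to least frequent."""
    arr = np.asarray(mask)
    pixels = arr.reshape(-1, arr.shape[-1])
    # axis=0 treats every row (one pixel) as a single item
    colors, counts = np.unique(pixels, axis=0, return_counts=True)
    order = np.argsort(-counts)
    return colors[order], counts[order]


# Synthetic stand-in for a real mask: white background, a red square,
# and one slightly off-white pixel playing the role of the noise.
mask = np.full((4, 4, 3), 255, dtype=np.uint8)
mask[:2, :2] = (255, 0, 0)
mask[3, 3] = (255, 254, 255)

colors, counts = unique_colors(mask)
for color, count in zip(colors, counts):
    print(color, count)
```

If the real mask shows a few colors with very large counts and many colors with tiny counts, the latter are most likely artifacts rather than classes.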

Hi, my apologies for the late response.

I think you are exactly right. What I did was I added the masks using GIMP, then I scaled down the images also using GIMP. I am going to try reversing the order and will let you know if that fixes things. Thank you.

Unfortunately resizing first and then adding in the colored masks did not change anything (unless I did something wrong in these steps).

I printed the list of unique colors and the length and this is what I get:

print(colors)
[[  7. 133.   0.]
 [  8. 132.   0.]
 [  8. 132.   2.]
 ...
 [255. 254. 255.]
 [255. 255. 253.]
 [255. 255. 255.]]

len(colors) = 3809 but that is strange because it should be 5 (or maybe 6 if it counts the black background).

Based on the output, it seems that some colors (e.g. white, which is given as [255, 255, 255]) have some “neighbors” with a small difference (e.g. [255, 254, 255]), which could be introduced by an interpolation method, e.g. during resizing.
If you are resizing the image, I would recommend using nearest neighbor interpolation, as it keeps the existing pixel values without adding new unique ones.
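
As a rough illustration of nearest neighbor resizing, here is a sketch using Pillow. It assumes the mask is an (H, W, 3) uint8 array; `resize_mask` is a hypothetical helper name and the mask is synthetic:

```python
import numpy as np
from PIL import Image


def resize_mask(mask, size):
    """Resize an (H, W, 3) uint8 mask to size=(width, height).

    Nearest neighbor copies existing pixels instead of blending them,
    so no new colors can appear."""
    resized = Image.fromarray(mask).resize(size, resample=Image.NEAREST)
    return np.array(resized)


# Synthetic mask: red square on a white background
mask = np.full((8, 8, 3), 255, dtype=np.uint8)
mask[2:6, 2:6] = (255, 0, 0)

small = resize_mask(mask, (4, 4))

colors_before = {tuple(c) for c in mask.reshape(-1, 3).tolist()}
colors_after = {tuple(c) for c in small.reshape(-1, 3).tolist()}
print(len(colors_before), len(colors_after))
```

Every color in the resized mask is also present in the original, which is not guaranteed with bilinear or bicubic interpolation.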
If you cannot isolate the operation that adds these values, you could try to cluster the values afterwards into the desired 6 classes.
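
One simple way to do such a clustering, when the intended colors are known in advance, is to snap every pixel to its nearest palette color. This is only a sketch: the palette below is a hypothetical example (not the actual class colors of this dataset), and `snap_to_palette` is an illustrative name:

```python
import numpy as np

# Hypothetical palette; replace with the intended class colors.
PALETTE = np.array(
    [
        [255, 255, 255],  # background
        [255, 0, 0],      # class 1
        [0, 128, 0],      # class 2
    ],
    dtype=np.int32,
)


def snap_to_palette(mask, palette):
    """Replace every pixel of an (H, W, 3) mask by the closest palette color
    (squared Euclidean distance in RGB space)."""
    pixels = mask.reshape(-1, 3).astype(np.int32)
    # Distance matrix of shape (num_pixels, num_palette_colors)
    dists = ((pixels[:, None, :] - palette[None, :, :]) ** 2).sum(axis=-1)
    nearest = dists.argmin(axis=1)
    return palette[nearest].reshape(mask.shape).astype(mask.dtype)


# Synthetic mask with slightly noisy versions of white and red
mask = np.array(
    [
        [[255, 254, 255], [255, 255, 253]],
        [[255, 0, 0], [250, 3, 2]],
    ],
    dtype=np.uint8,
)

clean = snap_to_palette(mask, PALETTE)
print(np.unique(clean.reshape(-1, 3), axis=0))
```

Note that the distance matrix grows with the number of pixels times the number of palette colors, so very large masks may need to be processed in chunks. Strongly blended border pixels are assigned to whichever class happens to be closest, which may or may not be the desired one.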

So I did this test with a 1920x1080 image consisting of my 5 masks and white background. When I inputted this image into the program above, the colors array consists of the following

[[ 10. 133.   1.]
 [ 14.   1. 133.]
 [ 32. 255.   0.]
 [182. 182. 202.]
 [182. 202. 182.]
 [183. 255. 182.]
 [187. 206. 187.]
 [188. 187. 206.]
 [188. 188. 206.]
 [188. 206. 188.]
 [188. 255. 187.]
 [189. 255. 188.]
 [193. 193. 210.]
 [193. 210. 193.]
 [194. 255. 193.]
 [243.   5. 247.]
 [249. 182. 251.]
 [249. 187. 251.]
 [249. 188. 251.]
 [249. 193. 251.]
 [255.   0.   0.]
 [255. 182. 182.]
 [255. 187. 187.]
 [255. 188. 188.]
 [255. 193. 193.]
 [255. 255. 255.]]

With len(colors) = 26. I did not resize the image so I am quite confused why there are 26 unique colors as opposed to 6. Perhaps I am doing something wrong with my preprocessing but I am lost with regards to that.

It seems the image itself contains these additional values and you can check the first 16 unique values via:

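import numpy as np
import matplotlib.pyplot as plt
import PIL.Image

img = PIL.Image.open("./color_test.png")
a = np.array(img)
# Without an axis argument, np.unique returns the sorted unique scalar values
u = np.unique(a)

# Show where each of the first 16 unique values occurs, one subplot per value
f, axarr = plt.subplots(4, 4)
for idx, ax in enumerate(axarr.reshape(-1)):
    if idx == 16:
        break
    ax.imshow(a == u[idx], cmap='gray')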

Which shows the expected blobs as well as the (unwanted) borders:
image

so you would have to make sure to either filter them out after loading the image or avoid them during its creation.
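
A possible way to filter after loading, sketched under some assumptions: the color-to-class assignment below is hypothetical, and any pixel whose color is not in the map gets a special ignore value instead of its own class. If the loss supports it (for example the `ignore_index` argument of `torch.nn.CrossEntropyLoss`), those border pixels can then be skipped during training:

```python
import numpy as np

# Hypothetical color -> class index assignment; adapt to the real classes.
COLOR_TO_CLASS = {
    (255, 255, 255): 0,  # background
    (255, 0, 0): 1,
    (32, 255, 0): 2,
}
IGNORE_INDEX = 255  # assumed marker for pixels with an unknown color


def mask_to_class_indices(mask, color_to_class, ignore_index=IGNORE_INDEX):
    """Convert an (H, W, 3) color mask to an (H, W) array of class indices.

    Pixels whose color is not a key of color_to_class keep ignore_index."""
    target = np.full(mask.shape[:2], ignore_index, dtype=np.int64)
    for color, class_idx in color_to_class.items():
        matches = np.all(mask == np.array(color, dtype=mask.dtype), axis=-1)
        target[matches] = class_idx
    return target


# Synthetic 2x2 mask: three known colors and one blended border pixel
mask = np.array(
    [
        [[255, 255, 255], [255, 0, 0]],
        [[32, 255, 0], [255, 182, 182]],
    ],
    dtype=np.uint8,
)

target = mask_to_class_indices(mask, COLOR_TO_CLASS)
print(target)
```

With this approach the color map always has exactly as many entries as there are classes, no matter how many stray colors the file contains.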

Thank you very much! I really appreciate it.

I realized that the tool I was using on GIMP to create masks added additional pixel values to the edges of the fills.

Hi, do you have any recommended software or editing tools I can use to mask my data? The tools that I use on GIMP always seem to introduce additional pixel values to my masks and it interferes with the unique-color-picking code. Thank you very much.

I guess the additional pixels might come from an interpolation step and/or quality reduction during the saving/export of the annotated mask. Which format are you currently using and are you able to define some kind of quality level while storing the mask or use a lossless format?
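
To check how much the export format alone contributes, a quick experiment along these lines might help. It is a sketch with a synthetic two-color mask; `count_colors_after_roundtrip` is an illustrative helper that saves to an in-memory buffer with Pillow and reloads the result:

```python
import io

import numpy as np
from PIL import Image


def count_colors_after_roundtrip(mask, fmt, **save_kwargs):
    """Save an (H, W, 3) uint8 mask in the given format, reload it,
    and return the number of unique colors in the reloaded image."""
    buffer = io.BytesIO()
    Image.fromarray(mask).save(buffer, format=fmt, **save_kwargs)
    buffer.seek(0)
    reloaded = np.array(Image.open(buffer).convert("RGB"))
    return len(np.unique(reloaded.reshape(-1, 3), axis=0))


# Synthetic mask with exactly two colors: green square on white
mask = np.full((64, 64, 3), 255, dtype=np.uint8)
mask[13:37, 13:37] = (0, 255, 0)

n_png = count_colors_after_roundtrip(mask, "PNG")
n_jpeg = count_colors_after_roundtrip(mask, "JPEG", quality=90)
print("PNG:", n_png, "JPEG:", n_jpeg)
```

PNG is lossless, so the two colors survive unchanged, while JPEG compression typically introduces extra colors around the edges even at high quality settings.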

I export my masks as a JPG with the following settings on GIMP.

So for example, with the following mask saved as a JPG, the Python program above reads 85 unique color values as opposed to 2 (green and white).

Figured it out! The fill tool I was using on GIMP had anti-aliasing which added additional pixel values to the mask boundary in order to smooth the edges. Thank you.