Help with understanding image transformation

Hi, could somebody explain what these two lines of code do to the image, and how I can reverse this transformation?
mask_np has shape (3, 384, 512).

mask_np = numpy.array(masks[group_idx][image_idx]).mean(axis=0).astype(int)
mask_one_hot = numpy.eye(self.classes_count)[mask_np]

It’s hard to say for certain what this code is doing, since none of the objects are defined.
However, based on the operations used, my guess is that mask_np contains class indices, while mask_one_hot is the one-hot encoded mask, as seen here:

import numpy as np

# setup
classes_count = 10
nb_samples = 3
# this is most likely containing the class indices
mask_np = np.random.randint(0, classes_count, (nb_samples,))
print(mask_np)
> [5 4 2]

# this is most likely creating a one-hot encoding for the class indices
mask_one_hot = np.eye(classes_count)[mask_np]
print(mask_one_hot)
> [[0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
   [0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
   [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]]

# argmax returns the class indices back
print(np.argmax(mask_one_hot, axis=1))
> [5 4 2]

I don’t know what masks contains, or how the mean over axis=0 followed by the cast to int produces the class indices, so I would need to see the input array.
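
For what it’s worth, here is a minimal sketch of the whole round trip under one assumption that I cannot verify from the post: that the 3 channels of the input are identical copies of a single class-index map (for example, a grayscale mask saved as RGB). The names class_map, mask, recovered and restored, the tiny (3, 2, 3) shape, and classes_count = 4 are all made up for illustration. If the channels differ, the mean mixes them and the original channels cannot be recovered; only the one-hot step can be undone with argmax.

```python
import numpy as np

classes_count = 4  # hypothetical number of classes

# Hypothetical input: one class-index map copied into 3 channels.
# A small (3, 2, 3) array stands in for the real (3, 384, 512) mask.
class_map = np.array([[0, 1, 2],
                      [3, 1, 0]])
mask = np.stack([class_map] * 3, axis=0)        # shape (3, 2, 3)

# Averaging identical channels gives back the class map (as floats),
# and astype(int) turns it into integer class indices.
mask_np = mask.mean(axis=0).astype(int)         # shape (2, 3)

# Indexing the identity matrix with an integer array picks one row per
# pixel, which appends a class axis at the end.
mask_one_hot = np.eye(classes_count)[mask_np]   # shape (2, 3, 4)

# argmax over the last (class) axis undoes the one-hot encoding.
recovered = np.argmax(mask_one_hot, axis=-1)    # shape (2, 3)

# Rebuilding the 3 channels is only valid because the channels were
# assumed to be identical in the first place.
restored = np.repeat(recovered[np.newaxis], 3, axis=0)  # shape (3, 2, 3)

print(np.array_equal(recovered, class_map))
print(np.array_equal(restored, mask))
```

Note that for a 2D mask the class axis is the last one, so the argmax uses axis=-1 rather than axis=1 as in the 1D example above.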
