Losing Mask Data

Hi! I'm new to PyTorch and neural networks as a whole, so forgive me if I am wrong in some of my information/understanding (or in how I am supposed to make this post). I am currently working with the BraTS 2020 dataset, and I wanted some help regarding data transformations.

From what I can understand, my masks seem to be one-hot encoded:

Data type of mask: <class 'numpy.ndarray'>
Shape of mask: (240, 240, 3)
Array dtype: uint8
Array max val: 1
Array min val: 0

The library I am using to build my models, segmentation_models.pytorch, expects masks that are not one-hot encoded, so that it can do the encoding itself (from what I've understood from its GitHub issues). So I tried to collapse everything into one channel, assigning each pixel value to its class. I then ran a sanity check to see if I could reverse the process and output each of the masks as an image. Here, I manage to get outputs for only 2 of the 3 masks. I've tried various solutions, whether via Google or bashing my head in with ChatGPT, but nothing seems to work. Here is a code snippet showing my processing of the masks:

mask = mask.transpose((2, 0, 1))
# Setting label 4 to 3 to avoid issues
mask[mask == 4] = 3

# Convert mask from one-hot encoding to class indices
mask = np.argmax(mask, axis=0)  # Now mask shape is (H, W)

# Add channel dimension to mask
mask = np.expand_dims(mask, axis=0)  # Now mask shape is (1, H, W)

# Convert mask to tensor
mask = torch.tensor(mask, dtype=torch.long)
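One thing I noticed while debugging (a minimal toy sketch, assuming all three channels are annotated tumor classes with no background channel, which I believe matches my data): argmax on such a mask is lossy, because a background pixel (all channels zero) and a pixel belonging to channel 0's class both map to index 0.

```python
import numpy as np

# Hypothetical 2x2 one-hot mask with 3 tumor channels and NO background channel.
# Pixel (0, 0) is background (all zeros); pixel (0, 1) belongs to channel 0's class.
mask = np.zeros((3, 2, 2), dtype=np.uint8)
mask[0, 0, 1] = 1  # class of channel 0
mask[1, 1, 0] = 1  # class of channel 1
mask[2, 1, 1] = 1  # class of channel 2

indices = np.argmax(mask, axis=0)
print(indices)
# [[0 0]
#  [1 2]]
# The background pixel and the channel-0 pixel are now both index 0,
# so the original one-hot mask can no longer be reconstructed from `indices`.
```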

Please let me know if I'm making a silly mistake or if you need more code/context. The forum wouldn't let me post images, but I'm happy to show the pre- and post-transformation visualizations I have. I've been testing whether the mask exists via this method:

for i in range(1000):
    sample_file_path = test_files[i]  # Access each file in the list
    with h5py.File(sample_file_path, 'r') as file:
        mask = file['mask'][()]
        # If mask is one-hot encoded, convert it to class indices
        mask = np.argmax(mask, axis=2)  # Assuming mask shape is (H, W, C)
        unique_labels = np.unique(mask)
        print(f"Unique labels in the mask for file {i}:", unique_labels)

which returns variations of [0, 1, 2], with 0 being the background.

Using argmax to create the indices should work and I’m also able to restore the original one-hot mask using this approach:

# create one-hot encoded mask
mask = torch.zeros(3, 240, 240).long()
for i in range(mask.size(1)):
    for j in range(mask.size(2)):
        mask[torch.randint(0, 3, (1,)), i, j] = 1
print(mask.min(), mask.max())
# tensor(0) tensor(1)

indices = mask.argmax(dim=0)
print(indices.shape)
# torch.Size([240, 240])
print(indices.min(), indices.max())
# tensor(0) tensor(2)

out = torch.nn.functional.one_hot(indices, num_classes=3)
print(out.shape)
# torch.Size([240, 240, 3])
print(out.min(), out.max())
# tensor(0) tensor(1)

# compare
print((out.permute(2, 0, 1).contiguous() == mask).all())
# tensor(True)

That's what I've read and what I've been trying, but it doesn't seem to work. I tried running your code, and while it runs, it does not generate 3's in the class indices for the third mask. Is this because there is no background channel by default? All three channels here are annotated masks. I also tried outputting the mask before any transposing and after restoring the original one-hot encoding, and the outputs look different, though I should mention that:

# Sample image to view
sample_file_path = os.path.join(directory, h5_files[25070])
data = {}
with h5py.File(sample_file_path, 'r') as file:
    for key in file.keys():
        data[key] = file[key][()]

# Transpose the mask to have channels first
mask = data['mask'].transpose(2, 0, 1)
print(mask.shape)
# (3, 240, 240)

# mask[mask == 4] = 3  # Did not affect results. Annotations comprise the
# GD-enhancing tumor (ET, label 4), the peritumoral edema (ED, label 2),
# and the necrotic and non-enhancing tumor core (NCR/NET, label 1)
mask = torch.tensor(mask, dtype=torch.long)
print(mask.min(), mask.max())
# tensor(0) tensor(1)

indices = mask.argmax(dim=0)
print(indices.shape)
# torch.Size([240, 240])
print(indices.min(), indices.max())
# tensor(0) tensor(2)

# Set print options to display the full array
torch.set_printoptions(profile="full")
print(mask)     # Did cmd+F for "[[": found 3 different channels; channel 3 contains pixels, i.e. 1's
print(indices)  # Contains 0 (background), 1, and 2, but no 3's for the third mask
torch.set_printoptions(profile="default")

out = torch.nn.functional.one_hot(indices, num_classes=3)
print(out.shape)
# torch.Size([240, 240, 3])
print(out.min(), out.max())
# tensor(0) tensor(1)

# compare
print((out.permute(2, 0, 1).contiguous() == mask).all())
# tensor(False)

Follow-up post: I've managed to solve it by doing something strange. My idea that the problem was the missing background class was correct (I think). Initially I tried to fix it by simply concatenating an empty channel onto the mask, but I needed to make it the first channel, and now the max of my tensor after argmax is 3. I also ran the rest of the code and was able to visualize all 3 masks from the indices (not converting back from one-hot yet; not sure if I'll need to after training my model). Anyway, any ideas on why this was the case? Why does adding an empty channel fix the issue?
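To illustrate what I did (a toy sketch with a tiny hypothetical mask, not the actual BraTS data): with an all-zero channel prepended at position 0, background pixels still argmax to 0, but the three tumor channels now land at indices 1, 2, and 3, so no tumor class collides with the background.

```python
import numpy as np

# Hypothetical 3-channel one-hot mask: 3 tumor channels, no background channel.
# Pixel (0, 0) is background (all zeros).
mask = np.zeros((3, 2, 2), dtype=np.uint8)
mask[0, 0, 1] = 1
mask[1, 1, 0] = 1
mask[2, 1, 1] = 1

# Prepend an all-zero background channel so tumor channels shift to positions 1..3
bg = np.zeros((1, 2, 2), dtype=np.uint8)
mask4 = np.concatenate([bg, mask], axis=0)  # shape (4, 2, 2)

indices = np.argmax(mask4, axis=0)
print(indices)
# [[0 1]
#  [2 3]]
# Background stays 0 and the tumor classes become 1, 2, 3, so argmax is now
# reversible with torch.nn.functional.one_hot(indices, num_classes=4).
```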

Your code is unfortunately not properly formatted, so it's hard to read. If your tensor uses 3 channels, the corresponding indices will be [0, 1, 2], since Python is 0-indexed. If you expect a class index of 3, you would need to work with 4 channels.
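As a minimal illustration of this point (the index values here are made up for the example):

```python
import torch

# Class indices per pixel; the maximum index is 3
indices = torch.tensor([[0, 1], [2, 3]])

# num_classes must be at least max index + 1, so index 3 requires 4 channels
out = torch.nn.functional.one_hot(indices, num_classes=4)
print(out.shape)
# torch.Size([2, 2, 4])
```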