Hi! I'm new to PyTorch and neural networks as a whole, so forgive me if some of my information or understanding is wrong (or if this isn't how I'm supposed to make a post). I am currently working with the BraTS 2020 dataset, and I'd like some help with data transformations.
From what I can understand, my masks seem to be one-hot encoded:
Data type of mask: <class 'numpy.ndarray'>
Shape of mask: (240, 240, 3)
Array dtype: uint8
Array max val: 1
Array min val: 0
The library I am using to build my models, segmentation_models.pytorch, expects masks that are not one-hot encoded so that it can do the encoding itself (as far as I understand from GitHub issues). So I tried to collapse the mask into a single channel, assigning each pixel the index of its class. As a sanity check, I then try to reverse the conversion and output each of the masks as an image. Here, I only get output for 2 of the 3 masks. I've tried various solutions, whether from Google or from bashing my head in with ChatGPT, but nothing seems to work. Here is a code snippet showing how I process the masks:
import numpy as np
import torch

# mask is the (H, W, 3) uint8 array described above
mask = mask.transpose((2, 0, 1))  # (H, W, C) -> (C, H, W)
# Set label 4 to 3 to avoid issues
mask[mask == 4] = 3
# Convert mask from one-hot encoding to class indices
mask = np.argmax(mask, axis=0)  # Now mask shape is (H, W)
# Add channel dimension to mask
mask = np.expand_dims(mask, axis=0)  # Now mask shape is (1, H, W)
# Convert mask to tensor
mask = torch.tensor(mask, dtype=torch.long)
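For reference, here is a minimal sketch of one possible cause of the missing mask. It assumes that background pixels are all zeros across the three channels and that the channels do not overlap; neither is confirmed by the output above. Under that assumption, np.argmax returns 0 both for background pixels and for pixels belonging to channel 0, so the two become indistinguishable after collapsing. The helper names one_hot_to_indices and indices_to_one_hot are hypothetical and only serve the illustration.

```python
import numpy as np


def one_hot_to_indices(mask_hwc: np.ndarray) -> np.ndarray:
    """Collapse an (H, W, C) one-hot mask to (H, W) class indices.

    Assumption: all-zero pixels are background and channels do not overlap.
    Background maps to 0 and channel k maps to k + 1.
    """
    foreground = mask_hwc.any(axis=-1)           # True where any channel is set
    indices = np.argmax(mask_hwc, axis=-1) + 1   # shift channels to 1..C
    return np.where(foreground, indices, 0).astype(np.int64)


def indices_to_one_hot(indices_hw: np.ndarray, num_channels: int) -> np.ndarray:
    """Invert one_hot_to_indices: (H, W) indices back to (H, W, C) uint8."""
    classes = np.arange(1, num_channels + 1)
    return (indices_hw[..., None] == classes).astype(np.uint8)


# Toy 2x2 mask: one pixel per channel, plus one all-zero background pixel.
toy = np.zeros((2, 2, 3), dtype=np.uint8)
toy[0, 0, 0] = 1
toy[0, 1, 1] = 1
toy[1, 0, 2] = 1

plain = np.argmax(toy, axis=-1)     # channel 0 and background both become 0
shifted = one_hot_to_indices(toy)   # background stays 0, channels become 1..3
restored = indices_to_one_hot(shifted, num_channels=3)

print(plain)    # [[0 1]
                #  [2 0]]
print(shifted)  # [[1 2]
                #  [3 0]]
print(np.array_equal(restored, toy))  # True
```

With this mapping there are four label values (background plus three classes), so a model trained on class indices would need four output classes rather than three.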
Please let me know if I'm making a silly mistake or if you need more code/context. It wouldn't let me post images, but I'm happy to show the pre- and post-processing visualizations I have. I've been testing whether each mask exists with this method:
import h5py
import numpy as np

for i in range(1000):
    sample_file_path = test_files[i]  # Access each file in the list
    with h5py.File(sample_file_path, 'r') as file:
        mask = file['mask'][()]
    # If mask is one-hot encoded, convert it to class indices
    mask = np.argmax(mask, axis=2)  # Assuming mask shape is (H, W, C)
    unique_labels = np.unique(mask)
    print(f"Unique labels in the mask for file {i}:", unique_labels)
This returns variations of [0, 1, 2], with 0 being background.
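A possible alternative check, sketched under the same assumption that background pixels are all zeros in every channel: inspect each channel directly instead of going through np.argmax, since a 0 in the argmax output cannot tell channel 0 apart from background. The function name describe_one_hot_mask is hypothetical, and the toy array stands in for a mask loaded from one of the files.

```python
import numpy as np


def describe_one_hot_mask(mask_hwc: np.ndarray):
    """Return the channels that contain any foreground and the all-zero pixel count."""
    channels_present = [int(c) for c in np.flatnonzero(mask_hwc.any(axis=(0, 1)))]
    background_pixels = int((~mask_hwc.any(axis=-1)).sum())
    return channels_present, background_pixels


# Toy 4x4 mask in which channel 0 is empty and channels 1 and 2 have one pixel each.
toy = np.zeros((4, 4, 3), dtype=np.uint8)
toy[0, 0, 1] = 1
toy[1, 1, 2] = 1

present, background = describe_one_hot_mask(toy)
argmax_labels = np.unique(np.argmax(toy, axis=2)).tolist()

print(present)        # [1, 2]
print(background)     # 14
print(argmax_labels)  # [0, 1, 2]
```

In this toy case the argmax check reports [0, 1, 2] even though channel 0 is empty, while the per-channel check shows that only channels 1 and 2 contain foreground.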