Loading all MRI slices of a patient to train a model

rtan · April 4, 2021, 7:06am

Hello!

I am quite new to using PyTorch, and I’m currently researching ways to create a custom dataset for my issue. I currently have around 287 patients, and I was able to preprocess the DICOM files to create images and mask files for all of them.

The problem I’m currently facing is as follows: I would like to train my UNET model on all the slices for each patient in the train group. I currently have a root directory called data, which contains two folders, images and masks. Both folders contain 287 patient folders, and each patient folder contains the same number (n) of MRI slices. The task I’m working on is semantic segmentation using UNET, and from what I can see, ImageFolder only works for image classification.

What would be the best way to create a custom dataset for this, and how would I load the data to my UNET model? Please let me know! Thank you in advance!

ptrblck · April 6, 2021, 7:09am

Your approach of using a custom Dataset sounds correct.
To implement it, you could have a look at this tutorial first and adapt it so that it fits your use case.
In particular, you would store the paths to each patient’s input images and the corresponding masks in the __init__ method and would then lazily load each pair in __getitem__.
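A minimal sketch of such a Dataset might look like the following. It is only an illustration under assumptions that are not in the thread: `SliceDataset` and `load_fn` are hypothetical names, `pairs` is assumed to be a list of (image_path, mask_path) tuples, and each slice is assumed to be a 2D array saved as a `.npy` file (swap in your own loader for PNG or other formats).

```python
import numpy as np
import torch
from torch.utils.data import Dataset


class SliceDataset(Dataset):
    """Map-style dataset that returns one (image, mask) pair per MRI slice."""

    def __init__(self, pairs, load_fn=np.load):
        # Only paths are stored here; no pixel data is read yet.
        # `pairs` is assumed to be [(image_path, mask_path), ...].
        self.pairs = list(pairs)
        # Assumption: slices are .npy files; pass another loader otherwise.
        self.load_fn = load_fn

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, index):
        image_path, mask_path = self.pairs[index]
        # Lazy loading: the files are read only when this index is requested.
        image = torch.as_tensor(self.load_fn(image_path), dtype=torch.float32)
        mask = torch.as_tensor(self.load_fn(mask_path), dtype=torch.long)
        # Add a channel dimension so the image has shape [1, H, W].
        return image.unsqueeze(0), mask
```

The dataset could then be wrapped in `torch.utils.data.DataLoader(dataset, batch_size=8, shuffle=True)`. The default batching only works if all slices have the same height and width; otherwise a resize or crop step would be needed first.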
The critical step would be to make sure the paths for the DICOM images and masks are sorted correctly, so that indexing these lists of paths yields the desired corresponding pairs and doesn’t mix patients etc.
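One way to keep the pairs aligned is to build them per patient and match each mask to its image by file name, rather than relying on two independently sorted lists. The sketch below assumes the layout described in the question (`data/images/<patient>/<slice>` and `data/masks/<patient>/<slice>`) and additionally assumes that a mask has the same file name as its image, which the thread does not confirm. `collect_pairs` and its `patients` argument are hypothetical names.

```python
from pathlib import Path


def collect_pairs(root, patients=None):
    """Return (image_path, mask_path) pairs, ordered by patient and slice name."""
    root = Path(root)
    image_root, mask_root = root / "images", root / "masks"

    # Sort patient folders so the order is reproducible across runs.
    patient_ids = sorted(p.name for p in image_root.iterdir() if p.is_dir())
    if patients is not None:
        # Optionally restrict to a subset, e.g. the patients in the train group.
        wanted = set(patients)
        patient_ids = [pid for pid in patient_ids if pid in wanted]

    pairs = []
    for pid in patient_ids:
        image_files = sorted(f for f in (image_root / pid).iterdir() if f.is_file())
        for image_path in image_files:
            # Assumption: the mask uses the same file name as the image.
            mask_path = mask_root / pid / image_path.name
            if not mask_path.is_file():
                raise FileNotFoundError(f"no mask found for {image_path}")
            pairs.append((image_path, mask_path))
    return pairs
```

Because every mask is looked up from its image path, a pair can never mix two patients, and a missing mask fails loudly instead of shifting all later pairs by one. Passing a list of patient IDs as `patients` also keeps the train/validation split at the patient level, so slices from one patient do not end up in both groups.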
Also, you could take a look at e.g. MONAI, which might already provide some Dataset implementations for similar use cases.