Creating 3D Dataset/DataLoader with patches

Haha, I should’ve guessed that.
I will run the experiments on a network and hope that it works.

Thanks, appreciate it.

Hello banikr,
I have a question. I read your post and the whole conversation. I am building a data loader for MRI images collected from ADNI. I have loaded a single image from the training folder; now I want to load all the MRI images iteratively and then apply a neural network for classification.
Please help me understand how you load your whole MRI dataset from the directory.
I have 900 MRI images in three different folders, i.e., the Alzheimer's data has three main classes:
CN, MCI, and AD. I want to load all the data from each folder, but how do I do that?
Furthermore, I have read a thousand posts and tutorials but couldn't figure out how to implement it, as I am not much of an expert in PyTorch and 3D data handling.
I am using the following IDE and libraries:
IDE: Spyder
PyTorch and TensorFlow
Python 3.7
Thanks in advance

Hey,
I did not load the whole MRI volume into the data loader. The MR images I am using are of size 172x220x156, so a full volume would exceed the memory the CUDA cores can load.
For image synthesis, I created 10,000 patches per image and augmented the data. Your classification case should be similar.
I am also doing regression analysis/prediction from the MR images, which won't work with patch-based training… so I subsampled the images to reduce the number of voxels per image.
Then the PyTorch data loader should work fine.
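Something like this should work for the subsampling part (a rough sketch using torch.nn.functional.interpolate; the volume here is just a random stand-in, not my actual code):

import torch
import torch.nn.functional as F

vol = torch.randn(172, 220, 156)  # stand-in for a loaded MR volume

# interpolate expects (N, C, D, H, W) for 3D data, so add batch and channel dims
small = F.interpolate(vol[None, None], scale_factor=0.5, mode='trilinear',
                      align_corners=False)
print(small.shape)  # torch.Size([1, 1, 86, 110, 78])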
Let me know if you need more help.
I would suggest using Jupyter notebooks or the PyCharm IDE for coding; I find them easy to use. Use Python 3.6 if possible, as not all the libraries support 3.7 yet.
Since this is the PyTorch help forum, I'd ask you to stick to it, eh… :wink:


How can I make use of torch.utils.data.Dataset and torch.utils.data.DataLoader on my own data (not just the torchvision.datasets)?

Is there a way to use the built-in DataLoader, which is used with the torchvision datasets, on any dataset?

Yes, that’s possible and you can write your own Dataset implementation and just pass it to a DataLoader.
Have a look at this tutorial for an example.
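For the ADNI layout described above (CN/MCI/AD folders of .nii files), a minimal custom Dataset could look like the sketch below; the root path is hypothetical, and nibabel is used for loading:

import os
import nibabel as nib
import torch
from torch.utils.data import Dataset, DataLoader

class ADNIDataset(Dataset):
    """Maps each .nii file under root/<class>/ to a (volume, class index) pair."""
    CLASSES = ["CN", "MCI", "AD"]

    def __init__(self, root):
        self.samples = []
        for label, cls in enumerate(self.CLASSES):
            cls_dir = os.path.join(root, cls)
            for fname in sorted(os.listdir(cls_dir)):
                if fname.endswith(".nii"):
                    self.samples.append((os.path.join(cls_dir, fname), label))

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        path, label = self.samples[idx]
        vol = torch.from_numpy(nib.load(path).get_fdata()).float()
        return vol.unsqueeze(0), label  # add a channel dim: (1, D, H, W)

# dataset = ADNIDataset("/path/to/ADNI")  # hypothetical root directory
# loader = DataLoader(dataset, batch_size=2, shuffle=True)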

Thanks banikr for your valuable reply.
That was my objective: to pass the whole MR image into my network and have the network classify it into its respective class. But now I know that there is no method that takes a 3D image directly as input, does some processing with a CNN (or whatever the network is), and then classifies it; everyone feeds patch-wise input into their networks. Furthermore, in my project I used all the ADNI data, so I didn't use augmentation but processed all my MR images directly.
Yes, you're right that regression analysis won't help in this regard; you have to use a neural network for that purpose, as you suggested.
For now I am very used to the Spyder IDE, as it's among the most-used IDEs these days.
If you don't mind, could you please show some code snippets for data loading, for guidance?
Thanks for reading such a long reply :innocent::innocent:

Hi banikr, how can I convert a single MRI image (or a bunch of them) in .nii format into patches?
Also, please guide me on how to subsample the same image using PyTorch.

Hi @ptrblck
I have a question about unfold. I want to extract patches from my dataset, and I use the medicaltorch library to load the data. When I call unfold, I get an error. I think that when I load the data using the DataLoader, it doesn't have access to the data. What can I do?
Thanks.

ROOT_DIR = "/home/elahe/data/dataset/"
img_list = os.listdir(os.path.join(ROOT_DIR, 'trainnii'))
label_list = os.listdir(os.path.join(ROOT_DIR, 'labelsnii'))
print(img_list[1])
img_list = (i.unfold(2, 32, 32).unfold(1, 32, 32).unfold(0, 32, 32) for i in img_list)
label_list = (i.unfold(2, 32, 32).unfold(1, 32, 32).unfold(0, 32, 32) for i in label_list)

filename_pairs = [(os.path.join(ROOT_DIR, 'trainnii', x), os.path.join(ROOT_DIR, 'labelsnii', y)) for x, y in zip(img_list, label_list)]
print(filename_pairs)
train_transform = transforms.Compose([
    mt_transforms.Resample(0.25, 0.25),
    mt_transforms.ElasticTransform(alpha_range=(40.0, 60.0),
                                   sigma_range=(2.5, 4.0),
                                   p=0.3),
    mt_transforms.ToTensor()]
)
train_dataset = mt_datasets.MRI2DSegmentationDataset(filename_pairs, transform=train_transform)
dataloader = DataLoader(train_dataset, batch_size=2, collate_fn=mt_datasets.mt_collate)

What error do you get?
unfold is a method that should be called on a tensor. Based on your code snippet, it looks like you are calling it on a file path (string).
Load the images, transform them to tensors, and then call unfold on them.
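For example, a rough sketch (the filename is just a placeholder; nibabel loads the .nii volume before unfold is applied):

import nibabel as nib
import torch

# Load the NIfTI volume and convert it to a tensor first;
# unfold is a tensor method and will fail on a file path string.
vol = torch.from_numpy(nib.load('scan.nii').get_fdata()).float()  # e.g. (D, H, W)

# Cut non-overlapping 32^3 patches: unfold(dimension, size, step) per spatial dim
patches = vol.unfold(0, 32, 32).unfold(1, 32, 32).unfold(2, 32, 32)
patches = patches.reshape(-1, 32, 32, 32)  # (num_patches, 32, 32, 32)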

I applied it after the transform and it worked. Thanks a lot.
I have another question: is there any way to pair labeled and unlabeled images in 3D, like "MRI2DSegmentationDataset" does?
"MRI2DSegmentationDataset" pairs images in 2D, but I want to pair them in 3D.
Or should I use the patches and transform them to 2D?
Can I train my data using patches in 3D?

I’m not sure what “pairing” means in this context.
If you want to work on a segmentation use case for 3D data, it should work in the same manner as for 2D data (just with an additional dimension).

It means making a list of tuples in the format (input filename, ground truth filename).
I use "MRI3DSegmentationDataset" for this.
Thanks for your help.

Creating tuples from filenames shouldn't depend on the dimensionality.
Would using this codebase as a starter work for your use case?

Yes, I used this. I changed my code to the following, but it raises an error
when I load my data and create the input pairs. I use "MRI3DSubVolumeSegmentationDataset" and "MRI3DSegmentationDataset".
First I used "MRI3DSubVolumeSegmentationDataset", but it raised the error: "Input shape of each dimension should be a multiple of length plus 2 * padding".
I don't know what to do.
Is "MRI3DSubVolumeSegmentationDataset" meant to create patches?

ROOT_DIR= "/home/elahe/data/dataset/"
img_list = os.listdir(os.path.join(ROOT_DIR,'trainnii'))
label_list = os.listdir(os.path.join(ROOT_DIR,'labelsnii'))
filename_pairs = [(os.path.join(ROOT_DIR,'trainnii',x),os.path.join(ROOT_DIR,'labelsnii',y)) for x,y in zip(img_list,label_list)]
print(filename_pairs)

 
train_transform = transforms.Compose([
    mt_transforms.Resample(0.25, 0.25, 0.25),
    mt_transforms.ToTensor()]
)

filename_pairs = mt_datasets.MRI3DSubVolumeSegmentationDataset(filename_pairs, cache=True,
                 transform=train_transform, canonical=False, length=(64, 64, 64), padding=0)

train_dataset = mt_datasets.MRI3DSegmentationDataset(filename_pairs, cache=True, transform=train_transform, canonical=False)

Hello banikr,
Can I access to “get_paired_patch_3D” function in your code?
Thanks.

Hi ptrblck, what should I do if I want to extract overlapping patches? For example, the image is 256x256x32 and the patch is 32x32x32, with a step size of 4. Have you found any example or tutorial?

unfold should work. Have a look at this post for an example.
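A rough sketch for those dimensions (the volume is a random stand-in; note that the result of unfold is a view, so no memory is copied until you reshape):

import torch

vol = torch.randn(256, 256, 32)  # stand-in for the loaded volume

# unfold(dimension, size, step): size-32 windows with step 4 -> overlapping patches
patches = vol.unfold(0, 32, 4).unfold(1, 32, 4).unfold(2, 32, 4)
print(patches.shape)  # torch.Size([57, 57, 1, 32, 32, 32]) -- still a view

# Flattening with .reshape(-1, 32, 32, 32) would materialize every window in memory;
# with this much overlap it is cheaper to index the patches lazily, e.g. in __getitem__.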

Hey @Aliktk
There are different ways, mainly overlapping and non-overlapping methods.

import numpy as np

def generate_patch_32_3(MR, Mask, cor, sag, axi):
    """
    :param MR: 3D MR volume
    :param Mask: 3D mask with the same shape as the MR volume
    :param cor: patch size along the coronal axis (e.g. 32)
    :param sag: patch size along the sagittal axis (e.g. 32)
    :param axi: patch size along the axial axis (e.g. 32)
    :return: MR patches of shape [nPatch, cor, sag, axi] and the corresponding
             mask patches of shape [nPatch, cor//2, sag//2, axi//2]
             (e.g. [32, 32, 32] and [16, 16, 16]), plus nPatch
    """
    # Half-margins used to crop each mask patch to the center of its MR patch
    hCor = cor // 4
    hSag = sag // 4
    hAxi = axi // 4
    # The volume is split into 8 overlapping "quadrants" of this shape,
    # anchored at the two extremes of each axis
    qShape = [96, 128, 128]
    c = [0, MR.shape[0] - qShape[0]]
    s = [0, MR.shape[1] - qShape[1]]
    a = [0, MR.shape[2] - qShape[2]]
    nQuad = len(c) * len(s) * len(a)
    nPatch = int(nQuad * (qShape[0] / cor) * (qShape[1] / sag) * (qShape[2] / axi))
    MR_patch = np.zeros([nPatch, cor, sag, axi], dtype=np.float32)
    Mask_patch = np.zeros([nPatch, cor // 2, sag // 2, axi // 2], dtype=int)
    patch_count = 0
    for x in c:
        for y in s:
            for z in a:
                MR_quad = MR[x:x + qShape[0], y:y + qShape[1], z:z + qShape[2]]
                Mask_quad = Mask[x:x + qShape[0], y:y + qShape[1], z:z + qShape[2]]
                # Tile each quadrant into non-overlapping patches
                for k in range(0, MR_quad.shape[0], cor):
                    for i in range(0, MR_quad.shape[1], sag):
                        for j in range(0, MR_quad.shape[2], axi):
                            MR_patch[patch_count] = MR_quad[k:k + cor, i:i + sag, j:j + axi]
                            # The mask patch is the center crop of the MR patch
                            Mask_patch[patch_count] = Mask_quad[k + hCor:k + cor - hCor,
                                                                i + hSag:i + sag - hSag,
                                                                j + hAxi:j + axi - hAxi]
                            patch_count += 1
    return MR_patch, Mask_patch, nPatch

Try the function here. You can leave out the Mask variable if you don't have one. The function works on an MR volume loaded from NIfTI (.nii) data; you can use the nibabel Python library for that.
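For example (the file names here are hypothetical; nibabel returns the volume as a NumPy array):

import nibabel as nib

MR = nib.load('subject01_T1.nii').get_fdata()      # hypothetical path
Mask = nib.load('subject01_mask.nii').get_fdata()  # hypothetical path
MR_patch, Mask_patch, nPatch = generate_patch_32_3(MR, Mask, 32, 32, 32)
print(MR_patch.shape, Mask_patch.shape)  # (nPatch, 32, 32, 32) (nPatch, 16, 16, 16)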

I have 131 CT volumes, and each volume has a different number of slices. How can I build a DataLoader that handles a different number of slices per volume?

You could use a custom collate_fn to return e.g. a list containing tensors of variable size, but the main question is how you plan to use these different numbers of slices in the actual model training.
Assuming you are creating patches and thus adding a new dimension to the input tensor, how would the model use these inputs for training? One possible approach would be to use a batch size of 1, but that's often not desired.
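A minimal sketch of such a collate_fn (the dataset here is a toy stand-in with random volumes):

import torch
from torch.utils.data import Dataset, DataLoader

class CTDataset(Dataset):
    """Toy stand-in: every volume has a different number of slices."""
    def __init__(self, depths):
        self.depths = depths

    def __len__(self):
        return len(self.depths)

    def __getitem__(self, idx):
        return torch.randn(self.depths[idx], 256, 256)  # (slices, H, W)

def list_collate(batch):
    # The default collate_fn would fail on mismatched shapes; return a list instead
    return list(batch)

loader = DataLoader(CTDataset([40, 55, 62, 48]), batch_size=2, collate_fn=list_collate)
for volumes in loader:
    print([v.shape for v in volumes])  # each volume keeps its own slice count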