Dataloader for a very large dataset of 2D slices for segmentation

I have a dataset of shape [40000, 512, 512] (dtype np.int16) and I want to create a DataLoader for a segmentation network.
The data comes from 200 NIfTI volumes of 512x512x200 each. Since I am training a 2D network, it would be better to shuffle slices across different subjects as well.
I had the following thoughts:
i) Create multiple tensors and load them through a DataLoader.
ii) Load the full volume every time, take a single slice from it and pass that on (slow and wasteful, since most of the loaded data is thrown away).
iii) Save every slice as its own .npz file and load those (takes a lot of disk space and does not seem worth it).
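
A possible middle ground between (ii) and (iii), as a minimal sketch only: keep the whole stack in a single .npy file and memory-map it, so that each item access reads just one slice from disk. At int16, the full [40000, 512, 512] stack is roughly 21 GB, which is why it is not loaded into RAM here. The class name SliceDataset, the file names, and the existence of a matching mask array are assumptions made for illustration, and the arrays are tiny stand-ins for the real data.

```python
import os
import tempfile

import numpy as np


class SliceDataset:
    """Map-style dataset over memory-mapped [N, H, W] stacks (hypothetical helper)."""

    def __init__(self, image_path, mask_path):
        self.image_path = image_path
        self.mask_path = mask_path
        self.images = None
        self.masks = None
        # Only the header is needed to learn the number of slices.
        self.length = np.load(image_path, mmap_mode="r").shape[0]

    def __len__(self):
        return self.length

    def __getitem__(self, idx):
        # Open the memmaps lazily so every worker process gets its own handle.
        if self.images is None:
            self.images = np.load(self.image_path, mmap_mode="r")
            self.masks = np.load(self.mask_path, mmap_mode="r")
        # Copy one slice out of the memmap; add a channel axis for the image.
        image = np.array(self.images[idx], dtype=np.float32)[None]
        mask = np.array(self.masks[idx], dtype=np.int64)
        return image, mask


# Tiny stand-in data: 20 slices of 8x8 instead of 40000 slices of 512x512.
tmp_dir = tempfile.mkdtemp()
image_path = os.path.join(tmp_dir, "images.npy")  # assumed file name
mask_path = os.path.join(tmp_dir, "masks.npy")    # assumed file name
rng = np.random.default_rng(0)
all_images = rng.integers(-1000, 1000, size=(20, 8, 8)).astype(np.int16)
all_masks = rng.integers(0, 2, size=(20, 8, 8)).astype(np.uint8)
np.save(image_path, all_images)
np.save(mask_path, all_masks)

dataset = SliceDataset(image_path, mask_path)
image, mask = dataset[7]
print(len(dataset), image.shape, mask.shape)
```

An object like this, with only `__len__` and `__getitem__`, can be passed to `torch.utils.data.DataLoader` with `shuffle=True`, which would mix slices from different subjects within a batch.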

Can someone help me understand what to do?

I can share my thoughts and the code for the dataloader that I made.



Could you explain the relationship between the initial dataset of [40000, 512, 512] and the 200 Nifti images?
Does each sample contain 200 images? In that case, could you deal with [40000, 512, 512, 200] images?

Oh, the initial dataset is just [200, 200, 512, 512], but I combined all the slices into a single tensor of [40000, 512, 512] to get 40000 slices.
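
To make the combination step concrete, here is a small sketch of the reshape and of how a flat slice index maps back to a subject and a slice within that subject. The sizes are small stand-ins for 200 subjects, 200 slices and 512x512 pixels, and the variable names are only illustrative.

```python
import numpy as np

# Small stand-ins for the real sizes (200, 200, 512, 512).
n_subjects, n_slices, height, width = 3, 4, 8, 8

# Fake [subjects, slices, H, W] data with unique values per pixel.
volumes = np.arange(
    n_subjects * n_slices * height * width, dtype=np.int16
).reshape(n_subjects, n_slices, height, width)

# Merge the subject and slice axes: [subjects * slices, H, W].
slices = volumes.reshape(-1, height, width)

# Recover which subject and which slice a flat index refers to.
flat_index = 7
subject, slice_in_subject = divmod(flat_index, n_slices)

print(slices.shape, subject, slice_in_subject)
```

For a contiguous array this reshape is a view and does not copy the data, so the merged stack costs no extra memory.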

Hi, I am looking for a solution to the same issue. Were you able to solve it? If so, could you please share it here? Thanks.