I am creating a class based on Dataset. In __getitem__, I read each MRI and collect all of its slices (100 slices per MRI) into a list. The dataset is then passed to the DataLoader:
train_loader = DataLoader(data_train,
The problem is that if I set shuffle=True, the batches are shuffled at the subject level. For example, if batch_size is 16, I get 16 different subjects, each contributing its 100 slices. I do not want this behavior, because the slices are not really shuffled. Do you have any ideas for shuffling at the slice level?
I have tried reading only one slice in __getitem__, but then training the model is very slow.
Any ideas would be appreciated.
You can write a custom Sampler and shuffle in a more fine-grained way. See the sampler keyword argument of the DataLoader, which you would use instead of shuffle=True. You can see some of the samplers here: https://pytorch.org/docs/stable/data.html#torch.utils.data.Sampler
They are quite simple to implement, so you can write a custom sampler that is aware of the slice structure of your dataset.
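As a rough sketch of what such a sampler could look like: the class below yields flat slice indices, shuffled within small blocks of subjects, so that only a few MRIs need to be open at any time. It assumes (this is not from the thread) a slice-level dataset in which index i maps to subject i // slices_per_subject and slice i % slices_per_subject. The class name and its parameters are made up for illustration. It is written in plain Python; the DataLoader documentation describes the sampler argument as accepting any iterable that implements __len__, though subclassing torch.utils.data.Sampler would be the more usual choice.

```python
import random


class BlockShuffleSliceSampler:
    """Yield flat slice indices, shuffled within blocks of a few subjects.

    Hypothetical helper: assumes index i of the dataset corresponds to
    subject i // slices_per_subject, slice i % slices_per_subject.
    """

    def __init__(self, num_subjects, slices_per_subject,
                 subjects_per_block=4, seed=None):
        self.num_subjects = num_subjects
        self.slices_per_subject = slices_per_subject
        self.subjects_per_block = subjects_per_block
        self.rng = random.Random(seed)

    def __iter__(self):
        subjects = list(range(self.num_subjects))
        self.rng.shuffle(subjects)  # new subject order on every pass
        for start in range(0, len(subjects), self.subjects_per_block):
            block = subjects[start:start + self.subjects_per_block]
            # Pool every slice of the subjects in this block, then shuffle the pool.
            indices = [s * self.slices_per_subject + k
                       for s in block
                       for k in range(self.slices_per_subject)]
            self.rng.shuffle(indices)
            yield from indices

    def __len__(self):
        # One index per slice, so this equals len() of a slice-level dataset.
        return self.num_subjects * self.slices_per_subject


sampler = BlockShuffleSliceSampler(num_subjects=6, slices_per_subject=100,
                                   subjects_per_block=2, seed=0)
order = list(sampler)
```

It would then be passed as DataLoader(slice_dataset, batch_size=16, sampler=sampler), without shuffle=True, where slice_dataset is a hypothetical dataset that returns a single slice per index and keeps the most recently loaded volumes cached. A larger subjects_per_block mixes slices more thoroughly at the cost of holding more volumes in memory.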
@smth Actually, I have thought about this solution, but the problem is that the __len__ of the DataLoader and the __len__ of the sampler that I created were not equal. I do not see how to handle that situation with the sampler.
I also tried extracting only one slice from the whole MRI in __getitem__. The problem with this approach is that memory usage explodes at some point during training.
Hmm, how about setting batch_size=1 in the DataLoader and having your custom dataset return a full batch every time __getitem__ is called? That way you can control shuffling and other aspects yourself in the Dataset.
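A minimal sketch of that idea, under some assumptions: the MRIs are represented here by in-memory tensors of shape (slices, H, W), whereas a real dataset would load them from disk inside __getitem__. The class name SliceBatchDataset and the subjects_per_batch parameter are hypothetical. Each call ignores the index and draws a fresh random batch, so an "epoch" is only a fixed number of random batches rather than an exact pass over every slice.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class SliceBatchDataset(Dataset):
    """Each item is one ready-made batch of slices drawn from a few subjects."""

    def __init__(self, volumes, batch_size=16, subjects_per_batch=4):
        # `volumes` stands in for the MRIs: a list of (slices, H, W) tensors.
        self.volumes = volumes
        self.batch_size = batch_size
        self.subjects_per_batch = subjects_per_batch

    def __len__(self):
        # Number of batches per epoch, not number of subjects.
        total_slices = sum(v.shape[0] for v in self.volumes)
        return total_slices // self.batch_size

    def __getitem__(self, index):
        # `index` is ignored: pick random subjects, then random slices from them.
        subjects = torch.randperm(len(self.volumes))[: self.subjects_per_batch]
        pool = torch.cat([self.volumes[int(s)] for s in subjects])
        picks = torch.randperm(pool.shape[0])[: self.batch_size]
        return pool[picks]


volumes = [torch.randn(100, 8, 8) for _ in range(6)]  # 6 fake subjects
loader = DataLoader(SliceBatchDataset(volumes), batch_size=1, shuffle=True)

# batch_size=1 adds a leading dimension of size 1, which is dropped here.
batch = next(iter(loader)).squeeze(0)
```

Here batch has shape (16, 8, 8): 16 slices mixed from up to 4 subjects. Passing batch_size=None to the DataLoader should disable automatic batching and make the squeeze unnecessary.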
Hi, I am facing a similar problem at the moment. Have you solved it?