I am trying to write a custom dataset and data loader for a large dataset (SIDD-Full). The dataset contains 200 scenes, each of these 200 scenes has 150 different image captures, and each of these captures has a noisy and a clean version. For more details, please refer to the dataset website here.
In each epoch, for each of the 200 scenes I want to sample one index from its 150 different captures, load that image pair, and then extract n patches of size (32, 32) from each image. I don't know how to handle this in the __getitem__(self, idx) function: I want to return instances of size (1, 4, 32, 32) (the raw images have 4 channels), but if I do the patching inside __getitem__(), each item will be a tensor of size (n, 1, 4, 32, 32), and I don't know how to handle that when creating batches out of the dataset with torch.utils.data.DataLoader.
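(For what it's worth, one way I've seen this shape issue handled is a custom collate_fn that flattens the per-image patch dimension into the batch dimension. The sketch below assumes each dataset item is a single tensor of shape (n, 1, 4, 32, 32); the function name patch_collate and the dummy tensors are just illustrative.)

```python
import torch

def patch_collate(batch):
    # Each dataset item has shape (n, 1, 4, 32, 32): n patches per image.
    # Stacking gives (B, n, 1, 4, 32, 32); flattening the first two
    # dimensions yields a flat batch of (B * n, 1, 4, 32, 32) patches.
    stacked = torch.stack(batch, dim=0)
    return stacked.view(-1, *stacked.shape[2:])

# Example with dummy tensors: n = 8 patches per image, batch of 4 images.
batch = [torch.randn(8, 1, 4, 32, 32) for _ in range(4)]
out = patch_collate(batch)
print(out.shape)  # torch.Size([32, 1, 4, 32, 32])
```

Passing such a function as collate_fn to torch.utils.data.DataLoader would let __getitem__ keep returning all n patches at once while the loader still yields 4-D-per-patch batches.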
I also have no idea how to iterate over the patches inside the __getitem__() function. Can anybody help me with this?
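(In case it helps frame the question, here is a minimal sketch of the setup I have in mind. The class name PatchDataset, the _load_pair placeholder, and the 256x256 image size are all made up for illustration; the real version would read the noisy/clean raw files from disk.)

```python
import random
import torch
from torch.utils.data import Dataset

class PatchDataset(Dataset):
    """Sketch: one random capture per scene per access, n random patches."""

    def __init__(self, num_scenes=200, captures_per_scene=150,
                 n_patches=8, patch_size=32):
        self.num_scenes = num_scenes
        self.captures_per_scene = captures_per_scene
        self.n_patches = n_patches
        self.patch_size = patch_size

    def _load_pair(self, scene_idx, capture_idx):
        # Placeholder for reading the noisy/clean raw pair from disk;
        # here we fabricate (1, 4, H, W) tensors instead.
        return torch.randn(1, 4, 256, 256), torch.randn(1, 4, 256, 256)

    def __getitem__(self, idx):
        # Sample one of the 150 captures for this scene.
        capture_idx = random.randrange(self.captures_per_scene)
        noisy, clean = self._load_pair(idx, capture_idx)
        _, _, h, w = noisy.shape
        ps = self.patch_size
        noisy_patches, clean_patches = [], []
        for _ in range(self.n_patches):
            # Crop the same random window from both images.
            top = random.randrange(h - ps + 1)
            left = random.randrange(w - ps + 1)
            noisy_patches.append(noisy[:, :, top:top + ps, left:left + ps])
            clean_patches.append(clean[:, :, top:top + ps, left:left + ps])
        # Each returned tensor has shape (n_patches, 1, 4, 32, 32).
        return torch.stack(noisy_patches), torch.stack(clean_patches)

    def __len__(self):
        return self.num_scenes

ds = PatchDataset()
noisy, clean = ds[0]
print(noisy.shape)  # torch.Size([8, 1, 4, 32, 32])
```

This still returns 5-D items, so it would need something like a custom collate_fn (or reshaping the loader's output) to get flat per-patch batches.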