I have a large custom-made dataset of images, larger than my memory, and I don’t know what the correct approach is to store it and use it for training. Should I save the images as JPGs/PNGs? Should I save them in a ZIP file? Or a CSV? Are there any important considerations when implementing the dataloader?
Datasets are usually much bigger than the memory available on a machine. This is why we feed the data to the network in batches.
For a custom dataset, I would advise saving the images as PNG if possible, since PNG is lossless. Are both the inputs and the labels images?
To use them, you could follow this tutorial, which shows how to use the PyTorch DataLoaders to load your data.
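To make the idea concrete, here is a minimal sketch of lazy loading, using only the Python standard library. It assumes the images sit as individual PNG files in one folder. The names `LazyImageFolder` and `iter_batches` are made up for illustration and are not part of PyTorch. The dataset keeps only the file paths in memory and reads a file from disk when that index is requested, so memory use depends on the batch size rather than on the dataset size.

```python
from pathlib import Path


class LazyImageFolder:
    """Map-style dataset (hypothetical name): stores paths, reads files on access."""

    def __init__(self, root, pattern="*.png"):
        # Sorting makes the index -> file mapping reproducible across runs.
        self.paths = sorted(Path(root).glob(pattern))

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, index):
        path = self.paths[index]
        # Only this one file is read from disk. A real pipeline would decode
        # the image here (for example with PIL) and apply its transforms.
        return path.name, path.read_bytes()


def iter_batches(dataset, batch_size):
    """Yield lists of samples so only one batch is held in memory at a time."""
    for start in range(0, len(dataset), batch_size):
        stop = min(start + batch_size, len(dataset))
        yield [dataset[i] for i in range(start, stop)]
```

The `__len__` / `__getitem__` pair is the same interface a map-style `torch.utils.data.Dataset` uses. In practice you would probably subclass that, return decoded tensors from `__getitem__`, and let `torch.utils.data.DataLoader` do the batching, shuffling, and parallel loading (`batch_size`, `shuffle`, `num_workers`) instead of a hand-written loop like `iter_batches`.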