What is the current best practice for loading a large image dataset (about 500 GB) into PyTorch?
I have tried an LMDB-based approach using this repo, and the loading time improved compared to the ImageFolder + DataLoader pair.
Hard disk space and multiprocessing are both factors to consider.
Thanks.