I have a dataset of over 12 million RGB images of size 180x180, approx. 85 GB on disk. I'm training an image classifier on an AWS p2.8xlarge instance (8 GPUs, 32 vCPUs, 488 GB RAM).
I have a custom Dataset object, which is basically an ImageFolder except that I load all images in advance instead of lazily. Concretely, I only change lines 41-42 here: https://github.com/pytorch/vision/blob/master/torchvision/datasets/folder.py like this:
```python
path = os.path.join(root, fname)
img = pil_loader(path)
item = (img, class_to_idx[target])
images.append(item)
```
and then remove the loading step (line 122) from the __getitem__ method.
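For reference, the change amounts to something like the following. This is a simplified, self-contained sketch rather than the actual torchvision code; the class name `PreloadedImageFolder` and the directory-walking details are mine, and in practice the class would subclass `torch.utils.data.Dataset`:

```python
import os
from PIL import Image

IMG_EXTENSIONS = ('.jpg', '.jpeg', '.png', '.bmp')

def pil_loader(path):
    # Mirrors torchvision's pil_loader: open the file and decode to RGB.
    with open(path, 'rb') as f:
        img = Image.open(f)
        return img.convert('RGB')

class PreloadedImageFolder:
    """ImageFolder-like dataset that decodes every image eagerly in
    __init__ (the lines 41-42 change) instead of lazily in __getitem__
    (the removed line 122)."""

    def __init__(self, root, transform=None):
        self.transform = transform
        classes = sorted(d for d in os.listdir(root)
                         if os.path.isdir(os.path.join(root, d)))
        self.class_to_idx = {c: i for i, c in enumerate(classes)}
        self.samples = []
        for target in classes:
            class_dir = os.path.join(root, target)
            for fname in sorted(os.listdir(class_dir)):
                if fname.lower().endswith(IMG_EXTENSIONS):
                    path = os.path.join(class_dir, fname)
                    img = pil_loader(path)  # eager load, kept in RAM
                    self.samples.append((img, self.class_to_idx[target]))

    def __getitem__(self, index):
        # No loading here anymore; images already live in memory.
        img, label = self.samples[index]
        if self.transform is not None:
            img = self.transform(img)
        return img, label

    def __len__(self):
        return len(self.samples)
```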
Yet when I run the classifier, memory usage is far higher than expected: after loading roughly 10% of the dataset, the process already uses over 100 GB of RAM. Any idea what the problem may be, or what I should focus on?
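For context, a back-of-the-envelope calculation of what the dataset occupies once decoded (assuming uncompressed 8-bit RGB, ignoring per-object Python/PIL overhead):

```python
# Estimate the in-memory size of the fully decoded dataset.
n_images = 12_000_000
bytes_per_image = 180 * 180 * 3          # uncompressed 8-bit RGB
total_gb = n_images * bytes_per_image / 1024**3
print(f"{total_gb:.0f} GB")              # → 1086 GB
```

So 10% of the dataset would come to roughly 109 GB decoded, which matches what I'm seeing, even though the compressed files are only 85 GB on disk.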
I’m grateful for any ideas. Thank you in advance.