This script is taking too much memory

import glob
from PIL import Image
from torchvision import transforms

trans = transforms.ToTensor()

def convert_to_tensor(X, y, name_of_fish, dic):
    images = glob.glob("fishes/" + name_of_fish + "/*.jpg")
    for path in images:
        # Image.ANTIALIAS was renamed Image.LANCZOS in newer Pillow versions
        image = Image.open(path).resize((720, 1280), Image.ANTIALIAS)
        x = trans(image)
        image.close()
        X += [x]                      # every decoded tensor is kept in this list
        y += [dic[name_of_fish]]

dic = {"ALB": 1, "BET": 2, "DOL": 3, "LAG": 4, "Nof": 5, "OTHER": 6, "SHARK": 7, "YFT": 8}
X = []
y = []
convert_to_tensor(X, y, "ALB", dic)

Running this script gives me an error saying that not enough memory is available, and it slows down my computer a lot. Basically I want to convert all the images to tensors so that I can apply a CNN.
What's the problem?

Hi,

The problem is that this script loads ALL your images into your computer's main memory. Each 720x1280 RGB image stored as a float32 tensor takes roughly 720*1280*3*4 bytes, about 11 MB, so a few thousand images already add up to tens of GB. If you have many of them, you may not have enough RAM to hold them all. For example, ImageNet, which is 150+ GB, will most likely not fit in memory!

I suspected that… that's why I included image.close()… but for some reason it doesn't work?
Anyway… how should I resolve the issue?

close() only closes the file handle used to read the image, but X still holds all your decoded images in memory.
To solve this, you can use an approach similar to what is done for ImageNet here, where images are loaded from disk only when they are needed and then discarded.

To be more specific, it uses the ImageFolder dataset from torchvision.
This dataset loads a given element from disk only when it is requested and then returns it (with some tricks to make this fast).
You will need your dataset to be laid out on disk the way the dataset expects (see the doc for the expected layout).
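As a minimal sketch, assuming your images are already organized as fishes/<class_name>/*.jpg (the folder names become the class labels, and ImageFolder assigns its own 0-based indices via dataset.class_to_idx, so the manual dic mapping is no longer needed):

import torch
from torchvision import datasets, transforms

# Resize takes (height, width); adjust to match the size you actually want.
trans = transforms.Compose([
    transforms.Resize((720, 1280)),
    transforms.ToTensor(),
])

# Images are read from disk only when the DataLoader asks for them,
# so only one batch at a time lives in RAM.
dataset = datasets.ImageFolder("fishes", transform=trans)
loader = torch.utils.data.DataLoader(dataset, batch_size=16, shuffle=True, num_workers=2)

for images, labels in loader:
    # images has shape (batch_size, 3, 720, 1280); labels holds the class indices
    ...  # feed the batch to your CNN here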

But the issue pops up again (CUDA error: out of memory) when I train the model. I can provide the details if you want.

This is a different issue: your data loading is no longer eating up all the RAM of your computer.
If the GPU is running out of memory as well, the first thing to do is reduce the batch size until a batch fits on the GPU. If you have a GPU with a small amount of memory, you might only be able to use small models.
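A rough sketch of what that looks like (model, criterion and optimizer are placeholders for whatever you are already using): only one batch is moved to the GPU at a time, and you lower batch_size until the forward/backward pass no longer runs out of memory.

# Shrink batch_size (e.g. 16 -> 8 -> 4) until training fits in GPU memory.
loader = torch.utils.data.DataLoader(dataset, batch_size=4, shuffle=True, num_workers=2)

model = model.cuda()
for images, labels in loader:
    images, labels = images.cuda(), labels.cuda()  # only this batch sits on the GPU
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()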