How to work with a large training set when dealing with auto-encoders on Google Colaboratory?

Hello, I don't know if I can post questions here that aren't PyTorch-specific.

I am training an auto-encoder (Keras) on Google Colab with 25,000 input images and 25,000 output images. So far I have tried:

1. Copying the large file from Google Drive to Colab each time, which takes 5-6 hours.
2. Converting the dataset to a NumPy array, but when I normalize the images the size grows a lot (from 7 GB to about 24 GB, for example), and then it no longer fits into RAM.
3. Zipping and unzipping the data is not an option for me.

If anyone knows how to convert the images into a (normalized) NumPy array without ending up with such a large file (~24 GB), please let me know.
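For reference, the size jump is roughly what you would expect from casting uint8 pixels (1 byte each) to float32 (4 bytes each). A possible workaround, sketched below under the assumption that the images have already been saved as uint8 arrays with `np.save` (the file names and batch size are placeholders), is to memory-map the saved arrays and normalize only one batch at a time:

```python
import numpy as np

# Placeholder paths -- adjust to wherever the arrays live on Colab/Drive.
INPUT_PATH = "inputs_uint8.npy"   # saved with np.save(..., arr.astype(np.uint8))
TARGET_PATH = "targets_uint8.npy"

# Memory-map the uint8 arrays: nothing is read into RAM until a slice is accessed.
inputs = np.load(INPUT_PATH, mmap_mode="r")
targets = np.load(TARGET_PATH, mmap_mode="r")

def batch_generator(x, y, batch_size=32):
    """Yield normalized float32 batches; only one batch lives in RAM at a time."""
    n = x.shape[0]
    while True:
        idx = np.random.permutation(n)
        for start in range(0, n, batch_size):
            sel = idx[start:start + batch_size]
            # Fancy indexing on the memmap loads just these rows,
            # and the float cast / normalization happens per batch.
            xb = x[sel].astype(np.float32) / 255.0
            yb = y[sel].astype(np.float32) / 255.0
            yield xb, yb
```

A generator like this can then be passed to `model.fit` (or `fit_generator` on older Keras versions) together with `steps_per_epoch`, so the full float32 array is never materialized.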

The question seems to be rather Keras/TF-specific, so I think you would get the best support on e.g. StackOverflow. I'm unfortunately not deeply familiar with Keras and don't know how the data loading (lazy loading) can be implemented there.
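In case it is useful, here is a rough sketch of what lazy loading could look like in Keras via `tf.keras.utils.Sequence`. The class name, file lists, image size, and batch size are all placeholders; the idea is simply to read and normalize one batch of image files at a time instead of building the whole 25k-image array up front:

```python
import math
import numpy as np
import tensorflow as tf

class AutoencoderSequence(tf.keras.utils.Sequence):
    """Loads and normalizes one batch of images at a time instead of the full set."""

    def __init__(self, input_paths, target_paths, batch_size=32, image_size=(128, 128)):
        self.input_paths = input_paths    # lists of file paths, e.g. from glob.glob(...)
        self.target_paths = target_paths
        self.batch_size = batch_size
        self.image_size = image_size      # assumed resolution -- change to the real one

    def __len__(self):
        return math.ceil(len(self.input_paths) / self.batch_size)

    def _load(self, paths):
        imgs = [
            tf.keras.preprocessing.image.img_to_array(
                tf.keras.preprocessing.image.load_img(p, target_size=self.image_size)
            )
            for p in paths
        ]
        # Normalize on the fly so only this batch ever exists as float32 in RAM.
        return np.stack(imgs).astype(np.float32) / 255.0

    def __getitem__(self, idx):
        lo = idx * self.batch_size
        hi = lo + self.batch_size
        return self._load(self.input_paths[lo:hi]), self._load(self.target_paths[lo:hi])

# Usage sketch:
# train_seq = AutoencoderSequence(input_files, target_files)
# autoencoder.fit(train_seq, epochs=10)
```

With this setup the images stay on disk (ideally on the Colab VM's local storage rather than mounted Drive, to avoid the slow copy), and only one normalized batch is held in memory at any time.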