How to do data preprocessing for a large amount of data?

I'm working with a large amount of image data, and this is my first time handling a dataset of this size. How do you do data preparation when the dataset is this large?

The common approach is to load and process the data lazily, which keeps memory usage low because only the samples currently needed are read from disk. This also means each transformation is applied on the fly to the loaded sample, rather than to the whole dataset in an offline preprocessing step.
The ImageNet example uses this approach and might be a good reference.
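
To make the idea concrete, here is a minimal sketch of lazy loading in plain Python. It is not the code from the ImageNet example. The names `LazyFileDataset` and `iter_batches` are made up for illustration, and the files are read as raw bytes to keep the sketch dependency-free. In a PyTorch setup you would typically subclass `torch.utils.data.Dataset`, decode the image in `__getitem__`, pass torchvision transforms as `transform`, and let `torch.utils.data.DataLoader` do the batching.

```python
class LazyFileDataset:
    """Hypothetical map-style dataset: keeps only file paths in memory
    and reads each sample from disk when it is indexed."""

    def __init__(self, paths, transform=None):
        self.paths = list(paths)    # cheap: paths only, no file contents
        self.transform = transform  # optional callable applied per sample

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, index):
        # The file is read here, at access time, not in __init__.
        with open(self.paths[index], "rb") as f:
            sample = f.read()
        # The transformation runs on this one loaded sample only.
        if self.transform is not None:
            sample = self.transform(sample)
        return sample


def iter_batches(dataset, batch_size):
    """Hypothetical helper: yield one list of samples at a time, so the
    caller only needs to hold a single batch in memory."""
    for start in range(0, len(dataset), batch_size):
        stop = min(start + batch_size, len(dataset))
        yield [dataset[i] for i in range(start, stop)]
```

Constructing `LazyFileDataset(paths, transform=my_transform)` does no file I/O. The reads and the transform happen only while looping over `iter_batches(dataset, batch_size)`, so peak memory depends on the batch size rather than on the dataset size.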