I’m wondering whether training time could be reduced by storing the training data (e.g. tens of thousands of images) in one big tensor instead of loading the images one by one through a DataLoader.
Or does this only speed up initialization, because once loaded the data sits in memory in tensor form anyway? And how do we draw the line between datasets small enough to keep fully in memory and datasets so large that they have to be partially reloaded every epoch?
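One way to draw that line is a back-of-the-envelope memory estimate. The sketch below computes the RAM footprint of a fully decoded dataset; the image count and resolution are hypothetical placeholders, not from the original question:

```python
# Rough memory estimate for keeping a decoded image dataset in RAM.
# All concrete numbers below are illustrative assumptions.
def dataset_bytes(n_images, channels=3, height=224, width=224, dtype_bytes=4):
    """Bytes needed to hold n_images decoded CHW images of the given dtype size."""
    return n_images * channels * height * width * dtype_bytes

# 50k RGB images at 224x224 as float32 (4 bytes per value):
gib_float32 = dataset_bytes(50_000) / 2**30
# The same images kept as uint8 (1 byte) and converted on the fly:
gib_uint8 = dataset_bytes(50_000, dtype_bytes=1) / 2**30
print(f"float32: {gib_float32:.1f} GiB, uint8: {gib_uint8:.1f} GiB")
```

If the uint8 estimate comfortably fits in RAM (with headroom for the model, activations, and the OS), preloading is an option; storing as uint8 and converting to float per batch keeps the footprint 4x smaller.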
I don’t care about a faster initialization phase, but can we improve the actual training time this way? If tens of thousands (or more) of images have to be decoded (from .jpg, .png, or similar) every epoch, decoding them once up front seems like an easy optimization.
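A decode-once cache can be sketched as a small Dataset that pays the decoding cost a single time in `__init__` and then serves tensors straight from RAM. Note that `paths` and `decode_fn` are hypothetical stand-ins for your own loading code (e.g. `torchvision.io.read_image` over .jpg files); this is a sketch, not a drop-in recipe:

```python
import torch
from torch.utils.data import Dataset

class CachedImageDataset(Dataset):
    """Decode every image once, then serve tensors from memory.

    Assumes all images share the same shape so they can be stacked
    into one big tensor; `decode_fn(path)` must return a uint8 CHW tensor.
    """
    def __init__(self, paths, decode_fn):
        # One-time cost: decode all images and stack them into a single
        # uint8 tensor (4x smaller than float32 while cached).
        self.data = torch.stack([decode_fn(p) for p in paths])

    def __len__(self):
        return self.data.shape[0]

    def __getitem__(self, idx):
        # Convert to float on access; per-sample augmentations that must
        # differ every epoch would also go here, not in __init__.
        return self.data[idx].float() / 255.0
```

One caveat: this only removes decoding time from the epoch loop. Random augmentations still have to run per access, and if decoding was overlapped with GPU compute by DataLoader workers anyway, the wall-clock gain may be smaller than the raw decode time suggests.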
For the case where the images are loaded again and again every epoch and should not be stored as a tensor (for whatever reason): which image format would be optimal? Are there noticeably fewer decoding steps for .tiff, .jpg, .png, etc., or does it not matter because each file is only loaded once?
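Decoding cost does differ between formats (PNG's zlib inflate, JPEG's entropy decode + IDCT, and uncompressed TIFF/BMP are quite different amounts of work), and it is easy to measure for your own data rather than guess. A hypothetical micro-benchmark, assuming Pillow is available and using a synthetic image rather than real files:

```python
# Micro-benchmark sketch: encode one synthetic image to several formats
# in memory, then time how long a full decode of each takes.
import io
import time
from PIL import Image

# Synthetic stand-in image; real timings depend heavily on your actual data.
img = Image.new("RGB", (512, 512), color=(120, 60, 200))

def decode_time(fmt, repeats=50):
    """Average seconds to fully decode one image stored in format `fmt`."""
    buf = io.BytesIO()
    img.save(buf, format=fmt)
    raw = buf.getvalue()
    start = time.perf_counter()
    for _ in range(repeats):
        # .load() forces the full decode; Image.open alone is lazy.
        Image.open(io.BytesIO(raw)).load()
    return (time.perf_counter() - start) / repeats

for fmt in ("JPEG", "PNG", "TIFF"):
    print(f"{fmt}: {decode_time(fmt) * 1e3:.3f} ms per decode")
```

Run this on a handful of your real images to get representative numbers; compression-heavy formats trade smaller files (less disk I/O) for more CPU work per decode, so the "optimal" format depends on whether your pipeline is I/O-bound or CPU-bound.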