My dataloader is extremely slow on the first epoch. I have gone through all the solutions and threads I could find online and almost nothing helps. These are the things I tried:
Tested on an NVMe SSD. The data transfer rate is extremely fast.
Turned off pin_memory; it doesn't help.
Moved from reading one HDF5 file to individual HDF5 files.
Removed compression from the HDF5 data.
My model is very small for practical purposes and the batch size is also small. But as the amount of data increases the dataloader gets slower and slower on the first epoch. Is there any solution to this?
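A minimal sketch of how the first-epoch slowdown could be measured on its own, without the model in the loop. The helper name `time_epochs` and the stand-in loader are hypothetical; the function only assumes that the loader can be iterated more than once, as a PyTorch `DataLoader` can.

```python
import time


def time_epochs(loader, num_epochs=2):
    """Iterate over `loader` without running a model; return seconds per epoch."""
    durations = []
    for _ in range(num_epochs):
        start = time.perf_counter()
        for _batch in loader:
            pass  # only data loading is timed; no forward/backward pass
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    # Stand-in for a real DataLoader: any re-iterable object works here.
    fake_loader = [([0.0] * 4, 1) for _ in range(8)]
    for epoch, seconds in enumerate(time_epochs(fake_loader)):
        print(f"epoch {epoch}: {seconds:.4f} s")
```

If the first epoch is much slower than the second even with no model involved, the bottleneck is in loading (for example cold file reads or per-item overhead) rather than in training.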
I think I have a temporary solution for now. I have split my dataloader into train and test. In train, I no longer send the string tuples I was sending before; I only send the image and its associated label. This seems to solve the issue for the moment, but I'll have to run the big tests again. I'll post an update if the problem comes back.
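The workaround above might look roughly like the following sketch. The class name `ImageLabelDataset`, the `return_metadata` flag, and the in-memory lists are all hypothetical stand-ins (the original code and HDF5 reading are not shown); the class follows the map-style dataset protocol (`__len__` and `__getitem__`) that PyTorch datasets use.

```python
class ImageLabelDataset:
    """Hypothetical map-style dataset: returns strings only when asked to."""

    def __init__(self, images, labels, names, return_metadata=False):
        self.images = images
        self.labels = labels
        self.names = names  # per-sample strings (e.g. file names), assumed
        self.return_metadata = return_metadata

    def __len__(self):
        return len(self.images)

    def __getitem__(self, index):
        image, label = self.images[index], self.labels[index]
        if self.return_metadata:
            # Test/eval path: also return the string attached to the sample.
            return image, label, self.names[index]
        # Train path: image and label only, so no strings get batched.
        return image, label


if __name__ == "__main__":
    images = [[0.1, 0.2], [0.3, 0.4]]
    labels = [0, 1]
    names = ["sample_a", "sample_b"]
    train_set = ImageLabelDataset(images, labels, names)
    test_set = ImageLabelDataset(images, labels, names, return_metadata=True)
    print(train_set[0])
    print(test_set[0])
```

Keeping a single class with a flag, rather than two separate dataset classes, is just one design choice; it keeps the loading logic in one place while letting the training loader skip the strings.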