How to speed up training with an h5 dataset and shuffle=True?

Hello, I have 200k images (7x256x256 each) stored across seven h5 files.
When I train with shuffle=True it takes 4 hours per epoch; without shuffling it takes 30 minutes per epoch.

The problem is that querying an h5 dataset at large index values is slow. And without shuffling, the model overfits spectacularly.
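
One idea that might help, as a rough sketch only: shuffle at the block level instead of per sample. Contiguous blocks are visited in random order, each block is read with one slice, and the samples are shuffled in memory. The names `iter_shuffled_blocks` and `block_size` are made up for illustration, and I have not measured whether this is random enough to stop the overfitting.

```python
import random


def iter_shuffled_blocks(n_samples, block_size, seed=None):
    """Yield (start, stop, order) tuples covering [0, n_samples) exactly once.

    Blocks are visited in random order. `order` is a shuffled list of
    offsets local to the block, so the file is only ever read with one
    contiguous slice per block.
    """
    rng = random.Random(seed)
    starts = list(range(0, n_samples, block_size))
    rng.shuffle(starts)  # random order of blocks
    for start in starts:
        stop = min(start + block_size, n_samples)
        order = list(range(stop - start))
        rng.shuffle(order)  # random order inside the block
        yield start, stop, order


# Hypothetical usage with h5py (dataset name "images" is an assumption):
#
#   with h5py.File(path, "r") as f:
#       dset = f["images"]
#       for start, stop, order in iter_shuffled_blocks(len(dset), 1024):
#           block = dset[start:stop]   # one contiguous read
#           for i in order:
#               sample = block[i]      # in-memory access
```

A larger `block_size` gives better mixing but needs more memory per block; at 7x256x256 values per image, 1024 images is already a sizeable array.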

Does anyone know how to speed up shuffle=True training with an h5 dataset?