Training loop takes a long time each epoch using TensorDataset


First of all, my dataset is loaded from a pickle file, where each variable is a NumPy array (they are velocity components). The arrays are then normalized and converted to torch tensors. I'm training an SRGAN on low-res and high-res images, and the dataset is around 14k images.

dataset_train = torch.utils.data.TensorDataset(..., HR_data_train)
trainloader = torch.utils.data.DataLoader(dataset_train, batch_size=8,
                                          shuffle=True, num_workers=8, pin_memory=True)

I'm training on an NVIDIA Tesla V100, and it should not take 14 min per epoch, where each epoch contains around 1800 batches. I believe slow IO is the bottleneck, and was wondering if there is a workaround for this?
I believe the whole dataset is read each epoch, and I was thinking about maybe creating a custom dataset loader, or putting all my tensors into an HDF5 file like
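
For reference, a custom loader for tensors that already sit in memory could look roughly like the sketch below. It slices whole batches out of the tensors instead of fetching and collating one sample at a time. The function name `iterate_batches` and the stand-in tensors `LR_demo` and `HR_demo` are hypothetical and only there to make the example runnable. Whether this is actually faster than `DataLoader` for this dataset is an assumption that would need to be measured.

```python
import torch


def iterate_batches(lr, hr, batch_size, shuffle=True, device="cpu"):
    """Yield (lr_batch, hr_batch) pairs by slicing in-memory tensors directly."""
    n = lr.shape[0]
    # One permutation per epoch takes the place of DataLoader's shuffling.
    order = torch.randperm(n) if shuffle else torch.arange(n)
    for start in range(0, n, batch_size):
        idx = order[start:start + batch_size]
        # Index a whole batch at once, so no per-sample collation is needed.
        yield lr[idx].to(device), hr[idx].to(device)


# Hypothetical stand-in data: 20 low-res (16x16) and high-res (64x64) images.
LR_demo = torch.randn(20, 1, 16, 16)
HR_demo = torch.randn(20, 1, 64, 64)

batches = list(iterate_batches(LR_demo, HR_demo, batch_size=8))
print(len(batches), batches[0][0].shape, batches[-1][0].shape)
```

In a training loop this would be called once per epoch, e.g. `for lr_batch, hr_batch in iterate_batches(LR_data, HR_data, 8, device="cuda")`, where `LR_data` and `HR_data` stand for the real tensors. It trades the worker processes of `DataLoader` for plain indexing, which only makes sense when the full dataset fits in memory.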

If you are dealing with a data loading bottleneck, I would recommend reading this post, which gives a good overview of possible causes and workarounds.
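
Before changing the storage format, it may be worth checking whether data loading really is the bottleneck. A minimal sketch of one way to do that is to iterate over the loader without running the model and time how long each batch takes to arrive. The helper `time_data_loading` and the stand-in dataset are hypothetical, and `num_workers=0` is used here only so the snippet runs anywhere; the real loader settings should be substituted when measuring.

```python
import time

import torch
from torch.utils.data import DataLoader, TensorDataset


def time_data_loading(loader, max_batches=50):
    """Return the average seconds spent waiting for a batch, with no model work."""
    waits = []
    start = time.perf_counter()  # the first wait also includes iterator start-up
    for i, _batch in enumerate(loader):
        waits.append(time.perf_counter() - start)
        if i + 1 >= max_batches:
            break
        start = time.perf_counter()
    return sum(waits) / len(waits)


# Hypothetical stand-in for the real low-res / high-res tensors.
demo_set = TensorDataset(torch.randn(64, 1, 16, 16), torch.randn(64, 1, 64, 64))
demo_loader = DataLoader(demo_set, batch_size=8, shuffle=True, num_workers=0)

avg_wait = time_data_loading(demo_loader)
print(f"average wait per batch: {avg_wait * 1e3:.3f} ms")
```

If the average wait multiplied by the number of batches per epoch comes close to the observed epoch time, loading is likely the limiting factor. If it is much smaller, the time is probably spent in the model's forward and backward passes instead.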