I am working on speech enhancement with the VCTK database.
I want to load the data and apply the pre-processing simultaneously and efficiently; the pre-processing is done in a single Python function.
My problem is that when I use the DataLoader, it loads just one wav file per call, and it returns a different amount of training data after pre-processing, because wav files of different durations are chopped into input-sized chunks. This means that on every iteration the network is trained with a small and varying batch size, which is time-consuming.
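To make the problem concrete, here is a minimal sketch (not my actual pre-processing code; `chop_into_chunks` and the chunk size are made-up names) showing why per-file loading yields a varying batch size:

```python
import numpy as np

CHUNK_SIZE = 16384  # hypothetical network input length in samples

def chop_into_chunks(wave, chunk_size=CHUNK_SIZE):
    """Chop a 1-D waveform into non-overlapping fixed-size chunks,
    dropping the incomplete tail."""
    n_chunks = len(wave) // chunk_size
    return wave[: n_chunks * chunk_size].reshape(n_chunks, chunk_size)

# Two files of different durations yield different numbers of chunks,
# so each per-file "batch" has a different size.
short = np.zeros(3 * CHUNK_SIZE + 100)
long_ = np.zeros(7 * CHUNK_SIZE + 5)
print(chop_into_chunks(short).shape)  # (3, 16384)
print(chop_into_chunks(long_).shape)  # (7, 16384)
```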
So I want to load and pre-process simultaneously and train with a constant batch size (chunks from several wav files should be stacked together). For now, I save all the preprocessed data as .npy files. Is there a more efficient way to load the data?
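For illustration, this is roughly what I mean by stacking chunks across several wav files. It is only a sketch with assumed names (`fixed_batches`, plain numpy arrays in place of real loaded audio); chunks are buffered across files and emitted in constant-size batches:

```python
import numpy as np

def fixed_batches(waves, chunk_size, batch_size):
    """Buffer fixed-size chunks across files and yield constant-size batches.

    A leftover partial batch at the end is dropped for simplicity.
    """
    buf = []
    for wave in waves:
        n = len(wave) // chunk_size
        buf.extend(wave[: n * chunk_size].reshape(n, chunk_size))
        while len(buf) >= batch_size:
            yield np.stack(buf[:batch_size])
            buf = buf[batch_size:]

# Two files giving 5 and 3 chunks -> 8 chunks -> two batches of 4.
waves = [np.zeros(5 * 100 + 7), np.zeros(3 * 100)]
batches = list(fixed_batches(waves, chunk_size=100, batch_size=4))
print([b.shape for b in batches])  # [(4, 100), (4, 100)]
```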