Integrate Kaldi with PyTorch

Mohamed_Nabih · January 23, 2020, 1:06pm

Dear All;

I am running the Kaldi ASR toolkit and have extracted MFCC features from a speech dataset and stored them in .ark, .scp and CMVN files. How can I train my network based on these files?

Thanks

ptrblck · January 24, 2020, 6:03am

You could use a library like kaldiio to load these samples and create a custom Dataset and pass it to a DataLoader as explained in this tutorial.

Once you have the Dataset ready, you could continue working with the architecture of your model.
I’m not completely sure how the data is stored, but since you are dealing with MFCC data, I assume you could treat it as “image” data?
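
A minimal sketch of what this could look like, not a definitive implementation. The names KaldiFeatureDataset, apply_cmvn, pad_collate and load_kaldi_dataset are hypothetical helpers made up for illustration. It assumes kaldiio.load_scp returns a lazy dict-like object mapping utterance IDs to (frames, dim) NumPy matrices, and that the CMVN stats use Kaldi's usual (2, dim + 1) layout (first row: per-dimension sums with the frame count in the last column, second row: sums of squares). Kaldi CMVN is often stored per speaker, so an optional utt2spk dict is accepted; how you parse your utt2spk file is left out.

```python
import numpy as np
import torch
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader, Dataset


def apply_cmvn(feats, stats):
    """Mean/variance-normalize a (frames, dim) matrix with (2, dim + 1) CMVN stats."""
    count = stats[0, -1]                      # number of frames accumulated
    mean = stats[0, :-1] / count              # per-dimension mean
    var = stats[1, :-1] / count - mean ** 2   # per-dimension variance
    return (feats - mean) / np.sqrt(np.maximum(var, 1e-10))


class KaldiFeatureDataset(Dataset):
    """Hypothetical Dataset over a mapping utt_id -> (frames, dim) feature matrix."""

    def __init__(self, feats, cmvn=None, utt2spk=None):
        self.feats = feats            # e.g. the object returned by kaldiio.load_scp
        self.cmvn = cmvn              # optional mapping key -> CMVN stats
        self.utt2spk = utt2spk or {}  # optional utt_id -> speaker_id (per-speaker CMVN)
        self.utt_ids = list(feats.keys())

    def __len__(self):
        return len(self.utt_ids)

    def __getitem__(self, idx):
        utt = self.utt_ids[idx]
        mat = np.asarray(self.feats[utt], dtype=np.float32)
        if self.cmvn is not None:
            # Fall back to the utterance ID when no speaker mapping is given.
            mat = apply_cmvn(mat, self.cmvn[self.utt2spk.get(utt, utt)])
        return utt, torch.from_numpy(mat.astype(np.float32))


def pad_collate(batch):
    """Zero-pad variable-length utterances to (batch, max_frames, dim)."""
    utt_ids, mats = zip(*batch)
    lengths = torch.tensor([m.shape[0] for m in mats])
    padded = pad_sequence(mats, batch_first=True)
    return list(utt_ids), padded, lengths


def load_kaldi_dataset(feats_scp, cmvn_scp=None, utt2spk=None):
    """Build the dataset from .scp paths (requires `pip install kaldiio`)."""
    import kaldiio  # imported lazily so the classes above work without it

    feats = kaldiio.load_scp(feats_scp)
    cmvn = kaldiio.load_scp(cmvn_scp) if cmvn_scp else None
    return KaldiFeatureDataset(feats, cmvn, utt2spk)
```

Usage would be along the lines of DataLoader(load_kaldi_dataset("feats.scp", "cmvn.scp"), batch_size=8, collate_fn=pad_collate), where the two file names are placeholders for your own paths. To treat a batch as "image" data for a Conv2d layer, padded.unsqueeze(1) adds a channel dimension and gives shape (batch, 1, frames, dim). Labels are not handled here; you would need to look them up by utterance ID in __getitem__.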

vincentqb · January 27, 2020, 11:21pm

torchaudio supports Kaldi ark and scp files, offers MFCC feature extraction, and also has a template for datasets that work with DataLoader.