Integrate Kaldi with PyTorch

Dear all,

I am using the Kaldi ASR toolkit. I extracted MFCC features from a speech dataset and stored them in .ark, .scp, and CMVN files. How can I train my network in PyTorch on these files?


You could use a library like kaldiio to load these samples and create a custom Dataset and pass it to a DataLoader as explained in this tutorial.
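A minimal sketch of that idea, assuming the features are readable through `kaldiio.load_scp`, which returns a lazy dict-like mapping from utterance ID to a `(frames, dim)` NumPy array. The names `KaldiFeatDataset`, `apply_cmvn`, `pad_collate`, and `load_kaldi_scp` are made up for illustration, and passing CMVN stats keyed by utterance ID is an assumption (Kaldi often stores them per speaker, in which case you would map utterances to speakers first). Targets such as alignments are not handled here.

```python
import numpy as np
import torch
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader, Dataset


def apply_cmvn(feats, stats):
    """Normalize (frames, dim) features with Kaldi CMVN stats of shape (2, dim + 1).

    Row 0 holds the per-dimension sums with the frame count in the last column;
    row 1 holds the per-dimension sums of squares.
    """
    count = stats[0, -1]
    mean = stats[0, :-1] / count
    var = stats[1, :-1] / count - mean ** 2
    return (feats - mean) / np.sqrt(np.maximum(var, 1e-10))


class KaldiFeatDataset(Dataset):
    """Wraps a mapping utt_id -> (frames, dim) array, e.g. from kaldiio.load_scp."""

    def __init__(self, feats, cmvn=None):
        self.feats = feats
        # Optional mapping utt_id -> CMVN stats (assumed layout, see note above).
        self.cmvn = cmvn
        self.utt_ids = list(feats.keys())

    def __len__(self):
        return len(self.utt_ids)

    def __getitem__(self, idx):
        utt_id = self.utt_ids[idx]
        x = np.asarray(self.feats[utt_id], dtype=np.float32)
        if self.cmvn is not None:
            x = apply_cmvn(x, np.asarray(self.cmvn[utt_id]))
        # astype makes a writable float32 copy that torch can wrap safely.
        return utt_id, torch.from_numpy(x.astype(np.float32))


def pad_collate(batch):
    """Zero-pad variable-length utterances to (batch, max_frames, dim)."""
    utt_ids, feats = zip(*batch)
    lengths = torch.tensor([f.shape[0] for f in feats])
    padded = pad_sequence(feats, batch_first=True)
    return list(utt_ids), padded, lengths


def load_kaldi_scp(scp_path):
    """Open a Kaldi .scp file lazily; kaldiio is a third-party package."""
    import kaldiio  # imported here so the Dataset also works with any plain mapping

    return kaldiio.load_scp(scp_path)
```

With a real feature file you would then write something like `DataLoader(KaldiFeatDataset(load_kaldi_scp("feats.scp")), batch_size=8, shuffle=True, collate_fn=pad_collate)`, where `feats.scp` is a placeholder path. The custom collate function is needed because utterances differ in length, so the default collation cannot stack them.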

Once you have the Dataset ready, you could continue working on the architecture of your model.
I’m not completely sure how the data is stored, but since you are dealing with MFCC data, I assume you could treat it as “image” data?
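To illustrate the "image" view, here is a rough sketch that treats each padded MFCC matrix as a single-channel image of shape `(frames, n_mfcc)` and feeds it to a small 2D CNN. The class name `SmallMfccCnn`, the layer sizes, and the 13-coefficient input are arbitrary assumptions, and the model produces one prediction per utterance, so it is not a frame-level acoustic model.

```python
import torch
import torch.nn as nn


class SmallMfccCnn(nn.Module):
    """Toy utterance-level classifier over MFCC 'images' (illustrative only)."""

    def __init__(self, num_classes):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            # Global pooling makes the model independent of the number of frames.
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(16, num_classes)

    def forward(self, x):
        # x: (batch, frames, n_mfcc) -> add a channel dim: (batch, 1, frames, n_mfcc)
        x = x.unsqueeze(1)
        x = self.conv(x).flatten(1)  # (batch, 16)
        return self.fc(x)  # (batch, num_classes)


batch = torch.randn(4, 120, 13)  # 4 utterances, 120 frames, 13 MFCCs
logits = SmallMfccCnn(num_classes=10)(batch)
print(logits.shape)  # torch.Size([4, 10])
```

Whether this beats a recurrent or 1D-convolutional model over the time axis depends on the task, so take it only as a starting point.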

torchaudio supports ark and scp, offers MFCC, and also has a template for datasets with DataLoader.
