Speech recognition and generation

Hello experts in DL and PyTorch,
I have multiple mp3 files of my own voice, each with a corresponding txt file containing the transcript.
How can I train a PyTorch model that generates speech for any given text in my voice (i.e. text-to-speech)? Any hints, tips, or ideas are welcome.

Thank you in advance.

I’m having the same problem. Have you checked out Michael Phi’s YouTube video, "I Built a Personal Speech Recognition System for my AI Assistant"?

Reply here if you figure this out, and if I figure it out first, I’ll do the same for you.
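
In the meantime, one step that seems necessary whichever direction you train (audio to text, or text to audio) is pairing each mp3 with its transcript. Here is a rough sketch of that step only. It assumes each recording and its transcript share a base name (e.g. clip01.mp3 and clip01.txt); that naming scheme and the pair_audio_with_text helper are my own assumptions, not something stated in the original post. The resulting list could then back a torch.utils.data.Dataset that decodes the audio on demand.

```python
from pathlib import Path


def pair_audio_with_text(data_dir):
    """Return (mp3_path, transcript) pairs for files sharing a base name.

    Hypothetical helper: assumes clip01.mp3 sits next to clip01.txt.
    """
    pairs = []
    for mp3_path in sorted(Path(data_dir).glob("*.mp3")):
        txt_path = mp3_path.with_suffix(".txt")
        if not txt_path.exists():
            # Skip recordings without a transcript rather than guessing one.
            continue
        transcript = txt_path.read_text(encoding="utf-8").strip()
        pairs.append((mp3_path, transcript))
    return pairs
```

This does not decode any audio or train anything; it only builds the (file, text) list that a training pipeline would start from.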