Can someone recommend a tutorial on speech recognition in PyTorch?

I built a neural network on the Urban Sound dataset by following a tutorial, but now I want to create my own dataset and network that will (hopefully) recognize a wake word for a voice assistant. Where should I start?

There is a voice activity detection example in torchaudio. I used it for inspiration when building my Raspberry Pi lightbulb control.
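For anyone new to the idea, voice activity detection just means deciding which parts of a recording contain sound worth processing. Below is a minimal sketch of a simple energy-threshold approach in plain PyTorch. It is not the method used in the torchaudio example; the function name `frame_energy_vad`, the 25 ms frame length, and the -35 dB threshold are all assumptions chosen for illustration.

```python
import math
import torch


def frame_energy_vad(waveform, sample_rate, frame_ms=25.0, threshold_db=-35.0):
    """Return one boolean per frame: True where the frame's RMS level
    exceeds threshold_db (relative to a full-scale amplitude of 1.0).

    Hypothetical helper for illustration; parameters are assumed values.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    # Drop trailing samples that do not fill a whole frame.
    n_frames = waveform.numel() // frame_len
    frames = waveform[: n_frames * frame_len].reshape(n_frames, frame_len)
    # RMS level per frame, converted to decibels (clamped to avoid log(0)).
    rms = frames.pow(2).mean(dim=1).sqrt()
    level_db = 20 * torch.log10(rms.clamp_min(1e-10))
    return level_db > threshold_db


# Synthetic test signal: one second of silence, then one second of a 440 Hz tone.
sample_rate = 16000
t = torch.arange(sample_rate) / sample_rate
silence = torch.zeros(sample_rate)
tone = 0.5 * torch.sin(2 * math.pi * 440 * t)

mask = frame_energy_vad(torch.cat([silence, tone]), sample_rate)
print(mask.shape, int(mask.sum()))
```

A fixed threshold like this tends to break down with background noise, so it is best treated as a first filter in front of the wake word model rather than a finished detector.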

Best regards

Thomas

Is it this one?

I had been thinking of audio/vad.py on the main branch of the pytorch/audio repository on GitHub.
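As a possible starting point for the wake word question, here is a rough sketch of a custom dataset and a small classifier. Everything in it is an assumption rather than a recommended design: the class names `WakeWordDataset` and `WakeWordNet` are made up, the clips are random noise standing in for real one-second recordings at 16 kHz, and a log-magnitude spectrogram from `torch.stft` is used only to keep the example free of extra dependencies (a mel spectrogram is a common alternative).

```python
import torch
from torch import nn
from torch.utils.data import Dataset, DataLoader


class WakeWordDataset(Dataset):
    """Hypothetical dataset of fixed-length clips labeled
    1 (wake word) or 0 (anything else)."""

    def __init__(self, clips, labels, n_fft=400, hop_length=160):
        self.clips = clips      # float tensor, shape (num_clips, num_samples)
        self.labels = labels    # long tensor, shape (num_clips,)
        self.n_fft = n_fft
        self.hop_length = hop_length
        self.window = torch.hann_window(n_fft)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        spec = torch.stft(
            self.clips[idx],
            n_fft=self.n_fft,
            hop_length=self.hop_length,
            window=self.window,
            return_complex=True,
        )
        # Log-magnitude spectrogram with a channel axis: (1, freq, time).
        features = torch.log1p(spec.abs()).unsqueeze(0)
        return features, self.labels[idx]


class WakeWordNet(nn.Module):
    """Small CNN sketch: two conv layers, global pooling, two output classes."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(16, 2)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))


# Random stand-in data: 8 clips of 1 second at 16 kHz (assumed values).
torch.manual_seed(0)
clips = torch.randn(8, 16000)
labels = torch.randint(0, 2, (8,))

loader = DataLoader(WakeWordDataset(clips, labels), batch_size=4)
model = WakeWordNet()

# One forward pass and loss computation to check that the shapes line up.
batch, targets = next(iter(loader))
logits = model(batch)
loss = nn.functional.cross_entropy(logits, targets)
print(batch.shape, logits.shape, loss.item())
```

To use real data, the random tensors would be replaced with your own recordings, trimmed or padded to a common length, with plenty of negative examples (other words, background noise) alongside the wake word clips.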