I made a neural network with the urban sound dataset by following a tutorial, but now I want to create my own dataset and network to recognize a wake word for a voice assistant (hopefully). Where should I start?
There is a voice activity detection example in torchaudio. I used it for inspiration when building my Raspberry Pi lightbulb control.
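For the network side, a wake-word detector often boils down to a small binary classifier over spectral features (wake word vs. background). Here is a minimal sketch, not from the thread, of such a model in PyTorch; the class name, feature sizes, and layer widths are all illustrative assumptions:

```python
# A minimal sketch: a tiny binary wake-word classifier over
# MFCC-like features. All names and sizes here are assumptions.
import torch
import torch.nn as nn


class WakeWordNet(nn.Module):
    def __init__(self, hidden: int = 64):
        super().__init__()
        # Treat the (n_mfcc, time) feature matrix as a 1-channel image.
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            # Pool to a fixed size so clip length can vary.
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 4 * 4, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 2),  # class 0: background, class 1: wake word
        )

    def forward(self, x):
        # x: (batch, 1, n_mfcc, time)
        return self.head(self.conv(x))


model = WakeWordNet()
dummy = torch.randn(8, 1, 40, 101)  # batch of 8 fake MFCC clips
logits = model(dummy)
print(logits.shape)  # torch.Size([8, 2])
```

In practice you would feed it real MFCCs (e.g. from `torchaudio.transforms.MFCC`) computed over short sliding windows of the microphone stream, and trigger the assistant when the wake-word probability stays high for a few consecutive windows.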
Is it this one?
I had been thinking of audio/vad.py at main · pytorch/audio · GitHub