I built a neural network on the Urban Sound dataset by following a tutorial, but now I want to create my own dataset and network that will (hopefully) recognize a wake word for a voice assistant. Where should I start?
There is a voice activity detection example in torchaudio. I used it as inspiration when building my Raspberry Pi lightbulb control.
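As a rough starting point for the dataset and network part, here is a minimal sketch. It is not the torchaudio example itself. It assumes 1-second clips at 16 kHz and two classes (wake word vs. anything else). The names `WakeWordClips`, `log_spectrogram`, `TinyWakeWordNet` and `train_one_epoch` are made up for illustration, and the layer sizes are arbitrary.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, Dataset

SAMPLE_RATE = 16_000         # assumed sample rate
CLIP_SAMPLES = SAMPLE_RATE   # assumed clip length: 1 second


class WakeWordClips(Dataset):
    """Hypothetical dataset of fixed-length clips; label 1 = wake word, 0 = other."""

    def __init__(self, waveforms, labels):
        # waveforms: float tensor (num_clips, CLIP_SAMPLES); labels: long tensor (num_clips,)
        assert waveforms.shape[0] == labels.shape[0]
        self.waveforms = waveforms
        self.labels = labels

    def __len__(self):
        return self.waveforms.shape[0]

    def __getitem__(self, idx):
        return self.waveforms[idx], self.labels[idx]


def log_spectrogram(waveforms, n_fft=400, hop_length=160):
    """Turn a batch of waveforms (B, T) into log-magnitude spectrograms (B, 1, freq, frames)."""
    window = torch.hann_window(n_fft, device=waveforms.device)
    spec = torch.stft(waveforms, n_fft=n_fft, hop_length=hop_length,
                      window=window, return_complex=True)
    return torch.log1p(spec.abs()).unsqueeze(1)


class TinyWakeWordNet(nn.Module):
    """Small CNN over spectrograms; outputs two logits per clip."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # average over frequency and time
        )
        self.classifier = nn.Linear(16, 2)

    def forward(self, waveforms):
        x = self.features(log_spectrogram(waveforms))
        return self.classifier(x.flatten(1))


def train_one_epoch(model, loader, optimizer):
    """Run one pass over the loader and return the mean cross-entropy loss."""
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    total = 0.0
    for waveforms, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(waveforms), labels)
        loss.backward()
        optimizer.step()
        total += loss.item()
    return total / len(loader)
```

To try it, you would fill `WakeWordClips` with your own recordings (padded or cropped to `CLIP_SAMPLES`), wrap it in a `DataLoader`, and call `train_one_epoch` with an optimizer such as `torch.optim.Adam`. You will probably need many more negative clips (other speech, noise) than wake-word clips.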
Best regards
Thomas
Is it this one?