Let’s say I have an audio signal in a tensor. Is there a function in torchaudio that I can use to output a frame-level VAD mask, i.e. 0 for silence and 1 for sound activity? Thanks!
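To make the desired output concrete, here is a rough sketch of the kind of mask I mean. It is only a simple energy threshold written in plain PyTorch, not a torchaudio function; the helper name `energy_vad_mask` and the values for `frame_length`, `hop_length`, and `threshold` are assumptions chosen for illustration.

```python
import math

import torch


def energy_vad_mask(waveform, frame_length=400, hop_length=160, threshold=1e-4):
    """Hypothetical helper (not part of torchaudio): one 0/1 value per frame."""
    # Slice the last dimension into overlapping frames: (..., num_frames, frame_length)
    frames = waveform.unfold(-1, frame_length, hop_length)
    # Mean power of each frame
    energy = frames.pow(2).mean(dim=-1)
    # 1 where the frame's power exceeds the threshold, 0 otherwise
    return (energy > threshold).long()


# Toy input: 1 s of silence followed by 1 s of a 440 Hz tone, sampled at 16 kHz
sample_rate = 16000
t = torch.arange(sample_rate) / sample_rate
tone = 0.5 * torch.sin(2 * math.pi * 440 * t)
waveform = torch.cat([torch.zeros(sample_rate), tone])

mask = energy_vad_mask(waveform)  # 1-D tensor with one entry per frame
```

With these assumed settings, frames that lie entirely in the silent half come out as 0 and frames that overlap the tone come out as 1. A fixed energy threshold will not be robust to background noise, which is why I am asking about a built-in alternative.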