I have an audio file that contains many silent segments, and I want to filter them out with torchaudio.functional.vad(). However, the function only trims the silence at the front of the audio; the silent segments in the middle and at the end are still there. Can someone tell me why? Thanks!
Besides, I would like to know the meaning of each parameter of vad(). This is because I am trying to split the audio into many shorter segments (e.g. 0.03 s each), process each one with vad(), and concatenate the processed segments at the end, but I don't know how to set the parameters.
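To make the chunked idea concrete, here is a rough sketch of what I mean. It does not call torchaudio at all: a plain RMS energy threshold stands in for vad(), so it is not the algorithm vad() implements. The names drop_silent_chunks, chunk_seconds, and threshold are made up for illustration, and the threshold value is just a guess for this toy signal.

```python
import math


def drop_silent_chunks(samples, sample_rate, chunk_seconds=0.03, threshold=0.01):
    """Split `samples` into fixed-length chunks and keep only the loud ones.

    A chunk is kept when its RMS energy is at or above `threshold`.
    This is a simple energy gate, used here as a stand-in for vad().
    """
    chunk_len = max(1, round(sample_rate * chunk_seconds))
    kept = []
    for start in range(0, len(samples), chunk_len):
        chunk = samples[start:start + chunk_len]
        rms = math.sqrt(sum(x * x for x in chunk) / len(chunk))
        if rms >= threshold:
            kept.extend(chunk)  # concatenate the non-silent chunks
    return kept


# Toy signal at 1 kHz: silence, tone, silence, tone, silence (0.3 s each).
sample_rate = 1000
silence = [0.0] * 300
tone = [0.5 * math.sin(2 * math.pi * 100 * n / sample_rate) for n in range(300)]
signal = silence + tone + silence + tone + silence

trimmed = drop_silent_chunks(signal, sample_rate)
print(len(signal), len(trimmed))
```

On this toy signal the silence at the front, in the middle, and at the end is all dropped, which is the behaviour I am hoping to get from vad() on real audio.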
Thanks a lot!