Torchaudio VAD reverse effect

aNR0 · March 15, 2022, 7:59pm

The Torchaudio documentation for the Voice Activity Detector transform (torchaudio.transforms - Torchaudio 0.11.0 documentation) says:

“The effect can trim only from the front of the audio, so in order to trim from the back, the reverse effect must also be used.”

Well, I have a simple question: how can I apply the reverse effect?

nateanl · March 30, 2022, 7:11pm

Hi @aNR0, you can try torchaudio.sox_effects.apply_effects_tensor with reverse as the effect.
For example:

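import torch
from torchaudio.sox_effects import apply_effects_tensor

effects = [
    ['reverse'],  # reverse the waveform along the time axis
]
sample_rate = 16000
# 1 second of 2-channel uniform noise in [-1, 1)
waveform = 2 * torch.rand([2, sample_rate * 1]) - 1
waveform, sample_rate = apply_effects_tensor(waveform, sample_rate, effects, channels_first=True)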
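
If the sox effects backend is not available in your install, flipping the tensor along the time axis with torch.flip should give the same reversal. This is only a minimal sketch, assuming a channels-first (channels, time) tensor; reverse_time is a hypothetical helper name, not a torchaudio function.

```python
import torch


def reverse_time(waveform: torch.Tensor) -> torch.Tensor:
    """Reverse a (channels, time) waveform along its last (time) axis."""
    return torch.flip(waveform, dims=[-1])


sample_rate = 16000
# 1 second of 2-channel uniform noise in [-1, 1)
waveform = 2 * torch.rand([2, sample_rate]) - 1

reversed_waveform = reverse_time(waveform)
# Reversing a second time restores the original sample order.
restored = reverse_time(reversed_waveform)
```

Because the flip is a pure tensor operation, the sample rate is unchanged and nothing needs to be passed back, unlike with apply_effects_tensor.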

Shahin_Konadath · August 17, 2023, 12:59pm

Try this!

import torchaudio
import torchaudio.functional as taf

def trim_silence(source: str, target: str = None, sample_rate: int = 22050, resample: bool = True):
    if target is None:
        target = source.replace(".wav", "_trim.wav")
    aud, sr = torchaudio.load(source)
    if aud.shape[0] > 1:
        # Downmix multi-channel audio to mono.
        aud = aud.mean(dim=0, keepdim=True)

    if resample and sr != sample_rate:
        aud = taf.resample(aud, sr, sample_rate)
        sr = sample_rate  # from here on, sr is the rate of the data in aud

    # vad only trims from the front: trim, reverse, trim again, reverse back.
    aud = taf.vad(aud, sr)
    aud, sr = torchaudio.sox_effects.apply_effects_tensor(aud, sr, [['reverse']])
    aud = taf.vad(aud, sr)
    aud, sr = torchaudio.sox_effects.apply_effects_tensor(aud, sr, [['reverse']])
    torchaudio.save(target, aud, sr, bits_per_sample=32)
    return target
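
The function above works because a front-only trim can be reused for the back by reversing the signal first. A toy sketch on a plain Python list shows the pattern in isolation. Note that trim_front and trim_both_ends are hypothetical helpers, and the simple amplitude threshold is only a stand-in for a real voice activity detector.

```python
from typing import List


def trim_front(samples: List[float], threshold: float = 0.01) -> List[float]:
    """Drop leading samples quieter than threshold (toy stand-in for a VAD)."""
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            return samples[i:]
    return []  # every sample was below the threshold


def trim_both_ends(samples: List[float], threshold: float = 0.01) -> List[float]:
    """Trim the front, reverse, trim the (new) front, then reverse back."""
    front_trimmed = trim_front(samples, threshold)
    back_trimmed = trim_front(front_trimmed[::-1], threshold)
    return back_trimmed[::-1]


signal = [0.0, 0.0, 0.5, -0.3, 0.0, 0.2, 0.0, 0.0]
trimmed = trim_both_ends(signal)
```

Silence in the middle of the signal is left alone, since each pass stops at the first sample that crosses the threshold.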