Runtime error when using a filter in torchaudio: `RuntimeError: Failed to create input filter: "time_base=1/16000:sample_rate=16000:sample_fmt=flt:channel_layout=0x0" (Invalid argument)`

hxt18258480551 · December 19, 2023, 7:44am

I'm trying audio filtering in torchaudio. The code below is copied almost verbatim from the "Audio Data Augmentation" tutorial in the Torchaudio 2.1.2 documentation:

import torch
import torchaudio

# Define 2-channel gaussian noise for a second
waveform1 = torch.randn(2, 16000)
sample_rate = 16000

# Define effects
effect = ",".join(
    [
        "lowpass=frequency=300:poles=1",  # apply single-pole lowpass filter
        "atempo=0.8",  # reduce the speed
        "aecho=in_gain=0.8:out_gain=0.9:delays=200:decays=0.3|delays=400:decays=0.3"
        # Applying echo gives some dramatic feeling
    ],
)


# Apply effects
def apply_effect(waveform, sample_rate, effect):
    effector = torchaudio.io.AudioEffector(effect=effect)
    return effector.apply(waveform, sample_rate)


waveform2 = apply_effect(waveform1, sample_rate, effect)

print(waveform1.shape, sample_rate)
print(waveform2.shape, sample_rate)

But it fails with a runtime error from FFmpeg:

Traceback (most recent call last):
  File "/home/hxt/ASR_adversarial_examples_new/torchaudio_filter_test.py", line 25, in <module>
    waveform2 = apply_effect(waveform1, sample_rate, effect)
  File "/home/hxt/ASR_adversarial_examples_new/torchaudio_filter_test.py", line 22, in apply_effect
    return effector.apply(waveform, sample_rate)
  File "/home/hxt/ASR_adversarial_examples_new/venv/lib/python3.8/site-packages/torchaudio/io/_effector.py", line 313, in apply
    reader = self._get_reader(waveform, sample_rate, output_sample_rate)
  File "/home/hxt/ASR_adversarial_examples_new/venv/lib/python3.8/site-packages/torchaudio/io/_effector.py", line 275, in _get_reader
    src = _encode(waveform, sample_rate, self.effect, muxer, encoder, self.codec_config)
  File "/home/hxt/ASR_adversarial_examples_new/venv/lib/python3.8/site-packages/torchaudio/io/_effector.py", line 99, in _encode
    writer.add_audio_stream(
  File "/home/hxt/ASR_adversarial_examples_new/venv/lib/python3.8/site-packages/torchaudio/io/_stream_writer.py", line 275, in add_audio_stream
    self._s.add_audio_stream(
RuntimeError: Failed to create input filter: "time_base=1/16000:sample_rate=16000:sample_fmt=flt:channel_layout=0x0" (Invalid argument)
Exception raised from add_src at /__w/audio/audio/pytorch/audio/torchaudio/csrc/ffmpeg/filter_graph.cpp:91 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x57 (0x7f8a65239617 in /home/hxt/ASR_adversarial_examples_new/venv/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x64 (0x7f8a651f498d in /home/hxt/ASR_adversarial_examples_new/venv/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #2: torchaudio::io::FilterGraph::add_src(AVFilter const*, std::string const&) + 0x37f (0x7f89e885e8ff in /home/hxt/ASR_adversarial_examples_new/venv/lib/python3.8/site-packages/torchaudio/lib/libtorchaudio_ffmpeg4.so)
frame #3: torchaudio::io::FilterGraph::add_audio_src(AVSampleFormat, AVRational, int, unsigned long) + 0x31 (0x7f89e885ea01 in /home/hxt/ASR_adversarial_examples_new/venv/lib/python3.8/site-packages/torchaudio/lib/libtorchaudio_ffmpeg4.so)
frame #4: torchaudio::io::get_audio_encode_process(AVFormatContext*, int, int, std::string const&, c10::optional<std::string> const&, c10::optional<std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > > const&, c10::optional<std::string> const&, c10::optional<int> const&, c10::optional<int> const&, c10::optional<torchaudio::io::CodecConfig> const&, c10::optional<std::string> const&, bool) + 0x777 (0x7f89e888bf27 in /home/hxt/ASR_adversarial_examples_new/venv/lib/python3.8/site-packages/torchaudio/lib/libtorchaudio_ffmpeg4.so)
frame #5: torchaudio::io::StreamWriter::add_audio_stream(int, int, std::string const&, c10::optional<std::string> const&, c10::optional<std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > > const&, c10::optional<std::string> const&, c10::optional<int> const&, c10::optional<int> const&, c10::optional<torchaudio::io::CodecConfig> const&, c10::optional<std::string> const&) + 0x90 (0x7f89e8893040 in /home/hxt/ASR_adversarial_examples_new/venv/lib/python3.8/site-packages/torchaudio/lib/libtorchaudio_ffmpeg4.so)
frame #6: <unknown function> + 0x18440 (0x7f89cb7ea440 in /home/hxt/ASR_adversarial_examples_new/venv/lib/python3.8/site-packages/torchaudio/lib/_torchaudio_ffmpeg4.so)
frame #7: <unknown function> + 0x2dce5 (0x7f89cb7ffce5 in /home/hxt/ASR_adversarial_examples_new/venv/lib/python3.8/site-packages/torchaudio/lib/_torchaudio_ffmpeg4.so)
frame #8: PyCFunction_Call + 0x10a (0x55f074ef05da in /home/hxt/ASR_adversarial_examples_new/venv/bin/python)
frame #9: _PyObject_MakeTpCall + 0x91 (0x55f074eecdb1 in /home/hxt/ASR_adversarial_examples_new/venv/bin/python)
frame #10: <unknown function> + 0x23ba7d (0x55f0750b1a7d in /home/hxt/ASR_adversarial_examples_new/venv/bin/python)
frame #11: _PyEval_EvalFrameDefault + 0x7849 (0x55f074edc629 in /home/hxt/ASR_adversarial_examples_new/venv/bin/python)
frame #12: _PyEval_EvalCodeWithName + 0xb60 (0x55f074fb1400 in /home/hxt/ASR_adversarial_examples_new/venv/bin/python)
frame #13: _PyFunction_Vectorcall + 0x92 (0x55f074eed302 in /home/hxt/ASR_adversarial_examples_new/venv/bin/python)
frame #14: <unknown function> + 0x23ba25 (0x55f0750b1a25 in /home/hxt/ASR_adversarial_examples_new/venv/bin/python)
frame #15: _PyEval_EvalFrameDefault + 0x5718 (0x55f074eda4f8 in /home/hxt/ASR_adversarial_examples_new/venv/bin/python)
frame #16: <unknown function> + 0x5dc07 (0x55f074ed3c07 in /home/hxt/ASR_adversarial_examples_new/venv/bin/python)
frame #17: _PyEval_EvalFrameDefault + 0x5829 (0x55f074eda609 in /home/hxt/ASR_adversarial_examples_new/venv/bin/python)
frame #18: _PyEval_EvalCodeWithName + 0xb60 (0x55f074fb1400 in /home/hxt/ASR_adversarial_examples_new/venv/bin/python)
frame #19: _PyFunction_Vectorcall + 0x92 (0x55f074eed302 in /home/hxt/ASR_adversarial_examples_new/venv/bin/python)
frame #20: _PyEval_EvalFrameDefault + 0x5762 (0x55f074eda542 in /home/hxt/ASR_adversarial_examples_new/venv/bin/python)
frame #21: _PyEval_EvalCodeWithName + 0xb60 (0x55f074fb1400 in /home/hxt/ASR_adversarial_examples_new/venv/bin/python)
frame #22: _PyFunction_Vectorcall + 0x92 (0x55f074eed302 in /home/hxt/ASR_adversarial_examples_new/venv/bin/python)
frame #23: _PyEval_EvalFrameDefault + 0x5762 (0x55f074eda542 in /home/hxt/ASR_adversarial_examples_new/venv/bin/python)
frame #24: <unknown function> + 0x5dc07 (0x55f074ed3c07 in /home/hxt/ASR_adversarial_examples_new/venv/bin/python)
frame #25: _PyEval_EvalFrameDefault + 0x5829 (0x55f074eda609 in /home/hxt/ASR_adversarial_examples_new/venv/bin/python)
frame #26: _PyEval_EvalCodeWithName + 0xb60 (0x55f074fb1400 in /home/hxt/ASR_adversarial_examples_new/venv/bin/python)
frame #27: PyEval_EvalCode + 0x23 (0x55f074fb14a3 in /home/hxt/ASR_adversarial_examples_new/venv/bin/python)
frame #28: <unknown function> + 0x1838e1 (0x55f074ff98e1 in /home/hxt/ASR_adversarial_examples_new/venv/bin/python)
frame #29: PyRun_SimpleFileExFlags + 0x167 (0x55f074ffdef7 in /home/hxt/ASR_adversarial_examples_new/venv/bin/python)
frame #30: Py_RunMain + 0x7b0 (0x55f074edf440 in /home/hxt/ASR_adversarial_examples_new/venv/bin/python)
frame #31: Py_BytesMain + 0x56 (0x55f074edf976 in /home/hxt/ASR_adversarial_examples_new/venv/bin/python)
frame #32: __libc_start_main + 0xe7 (0x7f8ae84b0c87 in /lib/x86_64-linux-gnu/libc.so.6)
frame #33: _start + 0x2a (0x55f074ede5ea in /home/hxt/ASR_adversarial_examples_new/venv/bin/python)

Is there a problem with the effect string? Could anyone help me with this issue? Thank you so much!

The environment information:

Python 3.8.13 (venv)
torch 2.1.2
torchaudio 2.1.2
FFmpeg 6.0

Hactogeek · January 24, 2024, 12:53pm

Hi,

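The problem comes from the shape of the waveform, not from the effect string.

The AudioEffector apply method expects the data in shape (L, C), that is (time, channel). Your tensor is (2, 16_000), so it is presumably read as 2 frames of 16,000 channels, which would explain the invalid channel_layout=0x0 in the error.

In your case the waveform should be (16_000, 2).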
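
A minimal sketch of that fix, assuming the waveform is kept in (channel, time) layout elsewhere in the script. The helper names `to_time_first` and `apply_effect_channels_first` are hypothetical and not part of torchaudio; only `torchaudio.io.AudioEffector` and its `apply` method come from the library. The `.contiguous()` call is a precaution, not something the documentation is known to require.

```python
import torch


def to_time_first(waveform: torch.Tensor) -> torch.Tensor:
    """Convert a (channel, time) waveform to the (time, channel) layout."""
    if waveform.dim() != 2:
        raise ValueError(f"expected a 2D tensor, got shape {tuple(waveform.shape)}")
    # Transpose, then copy into contiguous memory (precaution, see lead-in).
    return waveform.T.contiguous()


def apply_effect_channels_first(waveform, sample_rate, effect):
    """Apply an FFmpeg filter string to a (channel, time) waveform.

    Hypothetical wrapper: transposes to (time, channel) for AudioEffector
    and transposes the result back to (channel, time).
    """
    # Imported here so the shape helper above can be used without torchaudio/FFmpeg.
    import torchaudio

    effector = torchaudio.io.AudioEffector(effect=effect)
    # AudioEffector.apply takes and returns (time, channel) tensors.
    out = effector.apply(to_time_first(waveform), sample_rate)
    return out.T  # back to (channel, time)
```

With the tensors from the question, `apply_effect_channels_first(waveform1, sample_rate, effect)` should return a tensor whose first dimension is 2. The time dimension will generally differ from 16,000, since atempo and aecho change the length of the audio. Alternatively, simply create the input as `torch.randn(16000, 2)`.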

Have a nice day.

Tony