I’m trying to follow the PyTorch tutorials that explain how to work with audio files and devices. In the StreamReader Advanced Usages tutorial, the examples are written for macOS, which I don’t have. I’m using Linux and I’m having a hard time following them.
For one thing, the ffmpeg version that works with torchaudio is earlier than 4.4. And when I install torchvision (using conda), ffmpeg version 4.3 is installed. So far, no complaints. But the problem is that, for some reason, the installed ffmpeg 4.3 does not recognize any of my devices:
ffmpeg -devices
ffmpeg version 4.3 Copyright (c) 2000-2020 the FFmpeg developers
built with gcc 7.3.0 (crosstool-NG 1.23.0.449-a04d0)
configuration: --prefix=/opt/conda/conda-bld/ffmpeg_1597178665428/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placeh --cc=/opt/conda/conda-bld/ffmpeg_1597178665428/_build_env/bin/x86_64-conda_cos6-linux-gnu-cc --disable-doc --disable-openssl --enable-avresample --enable-gnutls --enable-hardcoded-tables --enable-libfreetype --enable-libopenh264 --enable-pic --enable-pthreads --enable-shared --disable-static --enable-version3 --enable-zlib --enable-libmp3lame
libavutil 56. 51.100 / 56. 51.100
libavcodec 58. 91.100 / 58. 91.100
libavformat 58. 45.100 / 58. 45.100
libavdevice 58. 10.100 / 58. 10.100
libavfilter 7. 85.100 / 7. 85.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 7.100 / 5. 7.100
libswresample 3. 7.100 / 3. 7.100
Devices:
D. = Demuxing supported
.E = Muxing supported
--
DE fbdev Linux framebuffer
D lavfi Libavfilter virtual input device
DE oss OSS (Open Sound System) playback
DE video4linux2,v4l2 Video4Linux2 output device
Meanwhile, I have ffmpeg n5.1.2 installed system-wide, and it works perfectly fine with all the devices on my machine:
ffmpeg -devices
ffmpeg version n5.1.2 Copyright (c) 2000-2022 the FFmpeg developers
built with gcc 12.2.0 (GCC)
configuration: --prefix=/usr --disable-debug --disable-static --disable-stripping --enable-amf --enable-avisynth --enable-cuda-llvm --enable-lto --enable-fontconfig --enable-gmp --enable-gnutls --enable-gpl --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libdav1d --enable-libdrm --enable-libfreetype --enable-libfribidi --enable-libgsm --enable-libiec61883 --enable-libjack --enable-libmfx --enable-libmodplug --enable-libmp3lame --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libopenjpeg --enable-libopus --enable-libpulse --enable-librav1e --enable-librsvg --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtheora --enable-libv4l2 --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxcb --enable-libxml2 --enable-libxvid --enable-libzimg --enable-nvdec --enable-nvenc --enable-opencl --enable-opengl --enable-shared --enable-version3 --enable-vulkan
libavutil 57. 28.100 / 57. 28.100
libavcodec 59. 37.100 / 59. 37.100
libavformat 59. 27.100 / 59. 27.100
libavdevice 59. 7.100 / 59. 7.100
libavfilter 8. 44.100 / 8. 44.100
libswscale 6. 7.100 / 6. 7.100
libswresample 4. 7.100 / 4. 7.100
libpostproc 56. 6.100 / 56. 6.100
Devices:
D. = Demuxing supported
.E = Muxing supported
--
DE alsa ALSA audio output
DE fbdev Linux framebuffer
D iec61883 libiec61883 (new DV1394) A/V input device
D jack JACK Audio Connection Kit
D kmsgrab KMS screen capture
D lavfi Libavfilter virtual input device
E opengl OpenGL output
DE oss OSS (Open Sound System) playback
DE pulse Pulse audio output
E sdl,sdl2 SDL2 output device
DE video4linux2,v4l2 Video4Linux2 output device
D x11grab X11 screen capture, using XCB
E xv XV (XVideo) output device
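To make the difference between the two builds explicit, here is a minimal Python sketch that parses the two device tables above and lists the input devices the conda build lacks. The helper name parse_devices and the two string constants are made up for illustration; the table rows are copied from the outputs above, and the parser only assumes the "flags, name, description" row layout that follows the "--" separator.

```python
# Device tables copied from the two `ffmpeg -devices` outputs above.
CONDA_FFMPEG = """
Devices:
D. = Demuxing supported
.E = Muxing supported
--
DE fbdev Linux framebuffer
D lavfi Libavfilter virtual input device
DE oss OSS (Open Sound System) playback
DE video4linux2,v4l2 Video4Linux2 output device
"""

SYSTEM_FFMPEG = """
Devices:
D. = Demuxing supported
.E = Muxing supported
--
DE alsa ALSA audio output
DE fbdev Linux framebuffer
D iec61883 libiec61883 (new DV1394) A/V input device
D jack JACK Audio Connection Kit
D kmsgrab KMS screen capture
D lavfi Libavfilter virtual input device
E opengl OpenGL output
DE oss OSS (Open Sound System) playback
DE pulse Pulse audio output
E sdl,sdl2 SDL2 output device
DE video4linux2,v4l2 Video4Linux2 output device
D x11grab X11 screen capture, using XCB
E xv XV (XVideo) output device
"""


def parse_devices(text):
    """Return {device_name: flags} from the text printed by `ffmpeg -devices`.

    Hypothetical helper: flags is "D" (input), "E" (output) or "DE" (both).
    """
    devices = {}
    in_table = False
    for line in text.splitlines():
        if not in_table:
            # The device rows start after the "--" separator line.
            in_table = line.strip() == "--"
            continue
        parts = line.split(None, 2)
        if len(parts) < 2 or parts[0] not in {"D", "E", "DE"}:
            continue
        flags, names = parts[0], parts[1]
        # Aliases share one row, e.g. "video4linux2,v4l2".
        for name in names.split(","):
            devices[name] = flags
    return devices


conda = parse_devices(CONDA_FFMPEG)
system = parse_devices(SYSTEM_FFMPEG)

# Input (demuxing) devices that the system build has and the conda build lacks.
missing_inputs = sorted(
    name for name, flags in system.items() if "D" in flags and name not in conda
)
print(missing_inputs)
```

Both alsa and pulse end up in missing_inputs, which is the core of the problem: the conda build has no input device that can reach my microphone other than oss.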
More precisely, I was hoping to see pulse among the devices of ffmpeg 4.3 so that I could use my microphone to read a live audio stream. But right now, there’s no way for me to do anything. I even tested with my own ffmpeg n5.1.2 (without installing torchvision), but then StreamReader does not recognize ffmpeg at all:
from torchaudio.io import StreamReader

StreamReader(
    src="1",
    format="pulse",
)
RuntimeError: StreamReader requires FFmpeg extension which is not available. Please refer to the stacktrace above for how to resolve this.
I would appreciate it if someone could point me to some examples of how to use StreamReader on Linux.
Thanks.