Hello all,
I’m experimenting with the Vulkan backend because my GPU is currently not supported by ROCm. I built a custom version of torch 2.9.0-rc2 and vision 0.24.0 with the following build configuration:
#!/usr/bin/env bash
export CMAKE_BUILD_TYPE=DEBUG
export BUILD_LIBTORCH_CPU_WITH_DEBUG=1
export BUILD_FUNCTORCH=1
export USE_VULKAN=1
export USE_VULKAN_FP16_INFERENCE=1
export USE_VULKAN_RELAXED_PRECISION=1
export USE_VULKAN_SHADERC_RUNTIME=1
export USE_OPENMP=1
export USE_CUDA=0
export USE_ROCM=0
export USE_SYSTEM_CPUINFO=1
export USE_SYSTEM_FP16=1
export USE_SYSTEM_PYBIND11=1
export USE_SYSTEM_XNNPACK=1
export USE_SYSTEM_FXDIV=1
export USE_SYSTEM_SLEEF=1
export USE_SYSTEM_PTHREADPOOL=1
export MAX_JOBS=8
export CMAKE_GENERATOR="Ninja"
# for vision
export WITH_CUDA=0
export WITH_MPS=0
export WITH_PNG=1
export WITH_JPEG=1
export WITH_WEBP=1
export WITH_AVIF=1
The plain torch module works fine, apart from some syntax inconsistencies when using the Vulkan backend and a few missing methods (like device_count()). But as soon as I import torchvision, a segmentation fault occurs:
(env) jds@padua:~/test/vtt$ python minimal_bug.py
[padua:248026:0:248026] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
==== backtrace (tid: 248026) ====
0 /lib/x86_64-linux-gnu/libucs.so.0(ucs_handle_error+0x2bc) [0x7f50eba3064c]
1 /lib/x86_64-linux-gnu/libucs.so.0(+0x3182f) [0x7f50eba3082f]
2 /lib/x86_64-linux-gnu/libucs.so.0(+0x319fa) [0x7f50eba309fa]
3 /lib/x86_64-linux-gnu/libc.so.6(+0x3fdf0) [0x7f50ec677df0]
4 /lib/x86_64-linux-gnu/libc.so.6(+0x1643c9) [0x7f50ec79c3c9]
5 /home/jds/test/vtt/env/lib/python3.13/site-packages/torch/lib/libtorch_python.so(_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE12_M_constructIPcEEvT_S7_St20forward_iterator_tag+0xca) [0x7f50e27b0e5e]
6 /home/jds/test/vtt/env/lib/python3.13/site-packages/torch/lib/libtorch_cpu.so(_ZN3c104impl13OperatorEntry14registerKernelERKNS_10DispatcherESt8optionalINS_11DispatchKeyEENS_14KernelFunctionES5_INS0_12CppSignatureEESt10unique_ptrINS_14FunctionSchemaESt14default_deleteISC_EENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x378) [0x7f50ca3997a2]
7 /home/jds/test/vtt/env/lib/python3.13/site-packages/torch/lib/libtorch_cpu.so(_ZN3c1010Dispatcher12registerImplENS_12OperatorNameESt8optionalINS_11DispatchKeyEENS_14KernelFunctionES2_INS_4impl12CppSignatureEESt10unique_ptrINS_14FunctionSchemaESt14default_deleteISA_EENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x141) [0x7f50ca385203]
8 /home/jds/test/vtt/env/lib/python3.13/site-packages/torch/lib/libtorch_cpu.so(_ZNR5torch7Library5_implEPKcONS_11CppFunctionENS_17_RegisterOrVerifyE+0x413) [0x7f50ca429c7b]
9 /home/jds/test/vtt/env/lib/python3.13/site-packages/torchvision/_C.so(+0x248de) [0x7f503e6288de]
10 /lib64/ld-linux-x86-64.so.2(+0x4fae) [0x7f50ec9acfae]
11 /lib64/ld-linux-x86-64.so.2(+0x507c) [0x7f50ec9ad07c]
12 /lib64/ld-linux-x86-64.so.2(_dl_catch_exception+0x106) [0x7f50ec9aa426]
13 /lib64/ld-linux-x86-64.so.2(+0xba28) [0x7f50ec9b3a28]
14 /lib64/ld-linux-x86-64.so.2(_dl_catch_exception+0x79) [0x7f50ec9aa399]
15 /lib64/ld-linux-x86-64.so.2(+0xbda8) [0x7f50ec9b3da8]
16 /lib/x86_64-linux-gnu/libc.so.6(+0x8f258) [0x7f50ec6c7258]
17 /lib64/ld-linux-x86-64.so.2(_dl_catch_exception+0x79) [0x7f50ec9aa399]
18 /lib64/ld-linux-x86-64.so.2(+0x24bf) [0x7f50ec9aa4bf]
19 /lib/x86_64-linux-gnu/libc.so.6(+0x8ed67) [0x7f50ec6c6d67]
20 /lib/x86_64-linux-gnu/libc.so.6(dlopen+0x69) [0x7f50ec6c7309]
21 /usr/lib/python3.13/lib-dynload/_ctypes.cpython-313-x86_64-linux-gnu.so(+0x17dfd) [0x7f50ec58ddfd]
22 python() [0x588a20]
23 python(_PyObject_MakeTpCall+0x27b) [0x54726b]
24 python(_PyEval_EvalFrameDefault+0xfce) [0x56270e]
25 python() [0x5a090c]
26 python(_PyObject_MakeTpCall+0x1d5) [0x5471c5]
27 python(_PyEval_EvalFrameDefault+0x2661) [0x563da1]
28 python(PyEval_EvalCode+0xcc) [0x55d48c]
29 python() [0x5da802]
30 python() [0x57e1cd]
31 python(_PyEval_EvalFrameDefault+0x4008) [0x565748]
32 python() [0x587b36]
33 python(PyObject_CallMethodObjArgs+0xe2) [0x5cc632]
34 python(PyImport_ImportModuleLevelObject+0x23b) [0x5cb42b]
35 python(_PyEval_EvalFrameDefault+0x577e) [0x566ebe]
36 python(PyEval_EvalCode+0xcc) [0x55d48c]
37 python() [0x5da802]
38 python() [0x57e1cd]
39 python(_PyEval_EvalFrameDefault+0x4008) [0x565748]
40 python() [0x587b36]
41 python(PyObject_CallMethodObjArgs+0xe2) [0x5cc632]
42 python(PyImport_ImportModuleLevelObject+0x23b) [0x5cb42b]
43 python(_PyEval_EvalFrameDefault+0x577e) [0x566ebe]
44 python(PyEval_EvalCode+0xcc) [0x55d48c]
45 python() [0x6ab8d1]
46 python() [0x6a899c]
47 python() [0x6b9943]
48 python() [0x6b93e3]
49 python() [0x6b921e]
50 python(Py_RunMain+0x3c1) [0x6b86f1]
51 python(Py_BytesMain+0x2b) [0x6838eb]
52 /lib/x86_64-linux-gnu/libc.so.6(+0x29ca8) [0x7f50ec661ca8]
53 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85) [0x7f50ec661d65]
54 python(_start+0x21) [0x682c81]
=================================
Segmentation fault
The minimal script is:
#!/usr/bin/env python
import torch
import torchvision
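To narrow down which import triggers the crash, the same imports can be run in a child interpreter with Python's faulthandler enabled. This is only a sketch: try_import is a hypothetical helper name, and since libucs appears to install its own signal handler (frames 0 to 2 above), the faulthandler traceback may not show up in every case.

```python
#!/usr/bin/env python
import subprocess
import sys


def try_import(modules):
    """Import the given modules in a fresh interpreter with faulthandler on.

    Returns (returncode, stderr) of the child process. A negative
    returncode means the child was killed by a signal (-11 is SIGSEGV
    on Linux), so the parent process survives the crash.
    """
    # Build a one-liner such as "import torch; import torchvision".
    code = "; ".join(f"import {name}" for name in modules)
    result = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stderr
```

Calling try_import(["torch"]) and then try_import(["torch", "torchvision"]) and comparing the return codes should show whether the crash really only appears once torchvision is loaded.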
Two things make me wonder:
- Why does torchvision depend on a video broadcasting library (libucs)?
- What causes the trouble?
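For the first question, one way to check which import first maps libucs into the process is to look at /proc/self/maps (Linux only) after each import. The following is a rough sketch; libraries_matching is a hypothetical helper, not part of torch or torchvision.

```python
def libraries_matching(maps_text, needle):
    """Return the sorted, unique file paths in /proc/<pid>/maps text
    whose path contains the given substring."""
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, pathname (optional).
        fields = line.split(None, 5)
        if len(fields) == 6 and needle in fields[5]:
            paths.add(fields[5].strip())
    return sorted(paths)
```

Running libraries_matching(open("/proc/self/maps").read(), "libucs") directly after import torch would show whether torch alone already pulls the library in, before torchvision is involved at all.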
My system in bullets:
- Debian 13 (Trixie)
- GCC 14.2.0
- Python 3.13
- libc 2.41
- libstdc++ 12.4.5
The post is already quite long, so if further information is needed, I’ll provide it later.
Any suggestions as to what went wrong?
Greets,
Jonathan