Segfault when combining libtorch + sdl

I’m not sure if this is a problem with SDL2 or libtorch, but my stack trace seems to suggest it’s on the libtorch end.

Does anyone have experience using libtorch + sdl?

I am compiling a simple program which initializes SDL, creates a window, destroys it, then quits. Everything compiles fine and it terminates without error. Once I link the program with libtorch, I get a segfault after main returns, while the exit-time cleanup runs. Nothing else in the source code changes other than including the libtorch header. Note that compiling a libtorch program without SDL also yields no errors.

Here is the code, with the only change being that the torch header is included:

#include <iostream>  // for std::cout
#include <SDL2/SDL.h>
#include <torch/torch.h>

int main() {
    SDL_Window* win = NULL;
    SDL_Surface* screen_surface = NULL;
    SDL_Surface* image_surface = NULL;

    if (SDL_Init(SDL_INIT_VIDEO)) {
        std::cout << "Error initializing SDL" << std::endl;
        return 1;
    }

    win = SDL_CreateWindow("Test Window", SDL_WINDOWPOS_UNDEFINED, SDL_WINDOWPOS_UNDEFINED, 400, 600, SDL_WINDOW_SHOWN);
    screen_surface = SDL_GetWindowSurface(win);


    // Cleanup
    SDL_DestroyWindow(win);
    win = NULL;
    SDL_Quit();
}

Here is the CMakeLists.txt, with the only change being that torch is being linked:

cmake_minimum_required (VERSION 3.12)

# Set flags
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
set(ABSL_PROPAGATE_CXX_STD ON)

project(main)

find_package(Torch REQUIRED)
set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} "${CMAKE_CURRENT_SOURCE_DIR}")
find_package(SDL2 REQUIRED)
find_package(SDL2_image REQUIRED)

add_executable(main main.cpp)

include_directories(${SDL2_INCLUDE_DIRS} ${SDL2_IMAGE_INCLUDE_DIRS})
target_link_libraries(main ${SDL2_LIBRARIES} ${SDL2_IMAGE_LIBRARIES} ${TORCH_LIBRARIES})

In addition to this, there is also a FindSDL2_image.cmake module, but I won’t include it here.

gdb gives the following:

Thread 1 "main" received signal SIGSEGV, Segmentation fault.
__GI___libc_free (mem=0x1) at malloc.c:3102
3102    malloc.c: No such file or directory.
(gdb) bt
#0  __GI___libc_free (mem=0x1) at malloc.c:3102
#1  0x00007fff4b65db87 in llvm::cl::Option::~Option() () from /usr/local/libtorch/lib/libtorch_cpu.so
#2  0x00007ffef514915e in __cxa_finalize (d=0x7fff5cd1a000) at cxa_finalize.c:83
#3  0x00007fff47d01783 in __do_global_dtors_aux () from /usr/local/libtorch/lib/libtorch_cpu.so
#4  0x00007fffffffe020 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Valgrind gives the following:

$ valgrind --leak-check=full ./main
==369036== Memcheck, a memory error detector
==369036== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==369036== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==369036== Command: ./main
==369036==
==369036== Warning: set address range perms: large range [0x59c9d000, 0xf3aa1000) (defined)
==369036== Warning: set address range perms: large range [0x4eaa000, 0x1bea4000) (defined)
==369036== Warning: set address range perms: large range [0xf3aa1000, 0x144c5b000) (defined)
==369036== Invalid free() / delete / delete[] / realloc()
==369036==    at 0x483CA3F: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==369036==    by 0x999CB86: llvm::cl::Option::~Option() (in /usr/local/libtorch/lib/libtorch_cpu.so)
==369036==    by 0x1C5AE15D: __cxa_finalize (cxa_finalize.c:83)
==369036==    by 0x6040782: ??? (in /usr/local/libtorch/lib/libtorch_cpu.so)
==369036==    by 0x4011F5A: _dl_fini (dl-fini.c:138)
==369036==    by 0x1C5ADA26: __run_exit_handlers (exit.c:108)
==369036==    by 0x1C5ADBDF: exit (exit.c:139)
==369036==    by 0x1C58B0B9: (below main) (libc-start.c:342)
==369036==  Address 0x1 is not stack'd, malloc'd or (recently) free'd
==369036==
==369036== Invalid free() / delete / delete[] / realloc()
==369036==    at 0x483CA3F: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==369036==    by 0x15DC4EDF: llvm::cl::opt<llvm::FunctionSummary::ForceSummaryHotnessType, true, llvm::cl::parser<llvm::FunctionSummary::ForceSummaryHotnessType> >::~opt() (in /usr/local/libtorch/lib/libtorch_cpu.so)
==369036==    by 0x1C5AE15D: __cxa_finalize (cxa_finalize.c:83)
==369036==    by 0x6040782: ??? (in /usr/local/libtorch/lib/libtorch_cpu.so)
==369036==    by 0x4011F5A: _dl_fini (dl-fini.c:138)
==369036==    by 0x1C5ADA26: __run_exit_handlers (exit.c:108)
==369036==    by 0x1C5ADBDF: exit (exit.c:139)
==369036==    by 0x1C58B0B9: (below main) (libc-start.c:342)
==369036==  Address 0x800000003 is not stack'd, malloc'd or (recently) free'd
==369036==
==369036== Invalid free() / delete / delete[] / realloc()
==369036==    at 0x483CA3F: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==369036==    by 0x9BBE3B3: llvm::cl::opt<llvm::GVDAGType, false, llvm::cl::parser<llvm::GVDAGType> >::~opt() (in /usr/local/libtorch/lib/libtorch_cpu.so)
==369036==    by 0x1C5AE15D: __cxa_finalize (cxa_finalize.c:83)
==369036==    by 0x6040782: ??? (in /usr/local/libtorch/lib/libtorch_cpu.so)
==369036==    by 0x4011F5A: _dl_fini (dl-fini.c:138)
==369036==    by 0x1C5ADA26: __run_exit_handlers (exit.c:108)
==369036==    by 0x1C5ADBDF: exit (exit.c:139)
==369036==    by 0x1C58B0B9: (below main) (libc-start.c:342)
==369036==  Address 0x800000004 is not stack'd, malloc'd or (recently) free'd
==369036==
==369036== Invalid free() / delete / delete[] / realloc()
==369036==    at 0x483CA3F: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==369036==    by 0x9BBE3F1: llvm::cl::opt<llvm::PGOViewCountsType, false, llvm::cl::parser<llvm::PGOViewCountsType> >::~opt() (in /usr/local/libtorch/lib/libtorch_cpu.so)
==369036==    by 0x1C5AE15D: __cxa_finalize (cxa_finalize.c:83)
==369036==    by 0x6040782: ??? (in /usr/local/libtorch/lib/libtorch_cpu.so)
==369036==    by 0x4011F5A: _dl_fini (dl-fini.c:138)
==369036==    by 0x1C5ADA26: __run_exit_handlers (exit.c:108)
==369036==    by 0x1C5ADBDF: exit (exit.c:139)
==369036==    by 0x1C58B0B9: (below main) (libc-start.c:342)
==369036==  Address 0x800000003 is not stack'd, malloc'd or (recently) free'd
==369036==
==369036== Conditional jump or move depends on uninitialised value(s)
==369036==    at 0x8D879FF: torch::jit::deregisterOperator(c10::FunctionSchema const&) (in /usr/local/libtorch/lib/libtorch_cpu.so)
==369036==    by 0x6163A3C: c10::Dispatcher::deregisterDef_(c10::OperatorHandle const&, c10::OperatorName const&) (in /usr/local/libtorch/lib/libtorch_cpu.so)
==369036==    by 0x61A4674: c10::RegisterOperators::~RegisterOperators() (in /usr/local/libtorch/lib/libtorch_cpu.so)
==369036==    by 0x1C5AE15D: __cxa_finalize (cxa_finalize.c:83)
==369036==    by 0x6040782: ??? (in /usr/local/libtorch/lib/libtorch_cpu.so)
==369036==    by 0x4011F5A: _dl_fini (dl-fini.c:138)
==369036==    by 0x1C5ADA26: __run_exit_handlers (exit.c:108)
==369036==    by 0x1C5ADBDF: exit (exit.c:139)
==369036==    by 0x1C58B0B9: (below main) (libc-start.c:342)
==369036==
==369036== Conditional jump or move depends on uninitialised value(s)
==369036==    at 0x8D8801D: torch::jit::deregisterOperator(c10::FunctionSchema const&) (in /usr/local/libtorch/lib/libtorch_cpu.so)
==369036==    by 0x6163A3C: c10::Dispatcher::deregisterDef_(c10::OperatorHandle const&, c10::OperatorName const&) (in /usr/local/libtorch/lib/libtorch_cpu.so)
==369036==    by 0x61A4674: c10::RegisterOperators::~RegisterOperators() (in /usr/local/libtorch/lib/libtorch_cpu.so)
==369036==    by 0x1C5AE15D: __cxa_finalize (cxa_finalize.c:83)
==369036==    by 0x6040782: ??? (in /usr/local/libtorch/lib/libtorch_cpu.so)
==369036==    by 0x4011F5A: _dl_fini (dl-fini.c:138)
==369036==    by 0x1C5ADA26: __run_exit_handlers (exit.c:108)
==369036==    by 0x1C5ADBDF: exit (exit.c:139)
==369036==    by 0x1C58B0B9: (below main) (libc-start.c:342)

This is using g++ 9.3.0, Ubuntu 20.04, libtorch 1.11.0.dev20211010+cu111

Same problem on my side. I tried with SDL, SFML and GTKMM; libtorch seems to conflict with all of them. Have you found a solution?

After facing problems with SDL, SFML, GTKMM, Qt and OpenCV, I may have figured it out. I have not tried it with SDL, but it works with GTKMM.

Basically, the libtorch build that I was using had been compiled against the old (pre-cxx11) ABI.

I downloaded the version of libtorch that uses the new cxx11 ABI at the following URL:
https://download.pytorch.org/libtorch/cpu/libtorch-cxx11-abi-shared-with-deps-1.10.2%2Bcpu.zip

When linking with this new version, the segmentation faults disappeared.

Thanks for the update. I believe I am using the new ABI version. Maybe when I have some time I’ll try redownloading and seeing if anything changes. Not sure if there is an easy way to tell the version I have from the installed directory.